Abstract

We perform an extensive numerical study of the disordered Poland-Scheraga (PS) model for
DNA denaturation in which self-avoidance is completely taken into account. Complementing
our previous work, we focus here on finite size scaling in terms of pseudo-critical temperatures.
We find notably that the mean value and the fluctuations of the pseudo-$T_c$ scale with the same
exponent, the correlation length exponent $\nu_r$ (for which we provide the refined evaluation
$\nu_r = 2.9 \pm 0.4$). This result (consistent with the typical picture that describes random
ferromagnets when disorder is relevant) is at variance with numerical results reported in the
literature for the PS model with self-avoidance, which led to an alternative scenario with a pseudo
first order transition. We moreover introduce and evaluate a crossover chain length $N^*$,
appropriate for characterizing the approach to the asymptotic regime in this model. Essentially,
below $N^*$, the behaviour of the model in our study could also agree with such an alternative
scenario. Based on an approximate prediction of the dependence of $N^*$ on the parameters of the
model, we show that, depending on the choice of these parameters, it may not be possible to reach
the asymptotic regime in practice. In this context it becomes possible to reconcile the apparently
contradictory numerical studies.


PACS: 64.60.Fr Equilibrium properties near critical points, critical exponents 

PACS: 82.39.Pj Nucleic acids, DNA and RNA bases
1 Introduction


The study of disordered systems has attracted wide interest in statistical physics. Numerical studies
are of critical importance in the field, as disorder is often hardly amenable to analytic treatment.
In any case, numerical approaches appear best suited for capturing the details of interesting
behaviours, such as those related to finite-size effects.

Against this background, the disordered Poland-Scheraga (PS) model for DNA denaturation stands
as a privileged model, from several perspectives: i) as an instance of the so-called "helix-coil
transition" models, it is designed to capture one of the most basic phenomena in molecular
biology, namely the opening of the double helix; ii) to render the helix-coil transition model more
realistic, the PS model takes into account intrinsic long-range effects (associated with the weights of
the loops in the coiled state), which can extend throughout the length of the system; iii) numerical
methods, originally developed specifically for this model, handle the statistical mechanics problem
with long-range effects very efficiently, making it possible to consider system sizes which are
orders of magnitude larger than those accessible with Monte-Carlo methods.

In this respect, the PS model of DNA denaturation with self-avoidance is of particular interest,
notably as it belongs to a more general class of polymer models for which the question of
disorder relevance has been definitively answered only recently PEI [3], in a probabilistic
mathematical framework. More precisely, for this class of models, disorder is predicted to be
relevant, the transition being expected to be at least of second order. In this context, the numerical
findings of our previous work [3], relying on a standard finite size scaling analysis, indeed agree with
a smooth transition, suggesting a value $\nu_r = 2.9 \pm 0.6$ for the correlation length exponent.

In contrast, other studies of the DNA denaturation model with self-avoidance [5, 6] instead
supported, on both numerical and theoretical grounds, the possibility of a pseudo first order transition.
Such a scenario could in fact be accommodated within theoretical descriptions of relevant models,
such as random ferromagnets, with disorder in any case expected to be relevant from the point of
view of the Harris criterion. Briefly, within this picture, one predicts the presence of two correlation
lengths, which can be evidenced numerically, in particular, by an analysis in terms of (appropriately
defined, sequence-dependent) pseudo-critical temperatures, since the two corresponding exponents
should rule the scaling of the mean value and of the fluctuations of the pseudo-$T_c$, respectively.
In detail, in these studies the exponent related to the fluctuations was found to have the value 2,
thus consistent with a second order transition (and, from this point of view, with disorder relevance),
whereas the one related to the mean value was found to have the value 1, thus denoting at the same
time a pseudo first order character of the transition.

In fact, based on recent mathematical findings, it would be possible to rule out such a
scenario for the thermodynamic limit behaviour of the present model. This situation is further
reinforced by the prediction, within the same probabilistic mathematical framework, of a
thermodynamic limit behaviour governed by a single correlation length [3].

The general situation as described above motivates the present extensive numerical study, in the 
prolongation of our previous work, since it appears desirable to clarify the overall situation relative 
to the theoretical/numerical results, all the more that in [Sj extremely large sizes were considered 
(up to N = 2 • 10®), which should in principle be appropriate for reflecting the thermodynamic limit 
properties evidenced in the theoretical studies. It is in this direction that we perform here a finite
size scaling analysis in terms of pseudo-critical temperatures. In any event, for the present peculiar
model, which displays strong corrections to scaling and a slow approach to the asymptotic regime [3],
a deeper general understanding of the finite size behaviour is of interest from both the physical and
the biological points of view.

More precisely, we tackle a hitherto unaddressed problem, namely the definition of appropriate
pseudo-critical temperatures in the presence of multiple peaks in quantities such as the specific heat
and the susceptibility, which are in fact a salient feature of the underlying biological
model. Focusing on the strongly non self-averaging behaviour of observables at the critical point,
we find that the pseudo-$T_c$ can be appropriately defined as the position of the absolute maximum
of the susceptibility, for a given sequence. We notably show that the mean value and the fluctuations
of this observable scale with the same exponent, thereby confirming that the asymptotic regime is
reached in our case. At variance with the numerical findings in [5, 6], the present result is thus
consistent with the typical picture describing random ferromagnets when disorder is relevant.

In order to understand the finite size behaviour in depth, we further perform an extensive
analysis of the loop-length probability distribution at different temperatures. The behaviour of this
observable reveals in a particularly evident way the presence of a crossover chain length $N^*$. Indeed,
this quantity appears to be particularly appropriate for characterizing the slow approach to the
asymptotic regime in the present model. Basically, for chain lengths below $N^*$, the behaviour of
the model in our study would also agree with the scenario of a pseudo first order transition. As
a matter of fact, the evaluation of $N^*$ in our case, and the approximate prediction within a
phenomenological framework of its dependence on the parameters of the model, allow us to resolve the
possible paradox concerning the contradictory numerical findings described above. We show notably
that, with the choice of parameters in [5, 6], it may not be possible to reach the asymptotic regime
in practice.

The paper is organized as follows. We briefly review the general background in Section 2. We
introduce the model and the observables in Section 3, where we first summarize known results for
the pure case (3.1); we introduce the disordered model, defining the different parameters entering
the computation of the partition function by means of the recursive equations (3.2); we describe
possible scenarios within the framework of the analysis in terms of pseudo-critical temperatures,
clarifying the different ways of averaging considered (3.3); and we recall the approximation used for
the power law and the details of the numerical implementation of the computations (3.4). We present
and discuss our results on the analysis with the pseudo-critical temperatures in Section 4, where
we start from the possible definitions of this observable (4.1), we study the scaling of its mean
value and of its fluctuations (4.2), we compare different ways of averaging (4.3), and we show data
on the non self-averageness parameter related to the susceptibility (4.4). We present and discuss
our results on the characterization of the finite size behaviour in Section 5, where we point out the
presence of the crossover chain length $N^*$ through a detailed study of data on the loop-length
probability distribution (5.1), and we discuss our attempt to roughly estimate the dependence of this
quantity on the parameters of the model (5.2). Finally, we present our conclusions in Section 6.

2 General background 

A realistic model for the DNA denaturation transition (taking into account the entropic weights
of the loops) was introduced by Poland and Scheraga (PS) [8, 9, 10] to account for experimental
melting curves, allowing their prediction [mini EH). Models for DNA denaturation were originally
developed in the context of fundamental biological questions, relevant notably to the possible overlap
between physical (helix/coil) and genetic (coding/non-coding) segmentations of genomic sequences
(for an overview see [14, 15, 16, 17], and references therein). Furthermore, the PS model also proved
to be of interest in the context of statistical mechanics: the segmentation of the DNA chain
into helix/coil regions represents an instance of an almost uni-dimensional system [18], which can
undergo a phase transition because of the long range effects associated with the entropic weights of
the loops (with the loop-length probability distribution described by an exponent $c_p$).

In fact, upon introducing the $c_p$ value appropriate for taking into account loop self-avoidance
effects, larger than the one describing a random walk loop, it was hypothesized early on m that a
more complete handling of self-avoidance, taking also into account the self-avoidance of the chain with
itself, would lead to an even larger value for the exponent, and correspondingly to a sharper transition
in the pure case (homogeneous sequence). However, the confirmation of this hypothesis [19, 20] had to
await conformal field theory results [21], showing that the correct loop-length probability distribution
exponent associated with a self-avoiding loop embedded in a self-avoiding chain in three dimensions
fulfills the condition $c_p > 2$, leading to a first order singularity (with corresponding correlation
length critical exponent $\nu_p = 1$). The first order character of the transition was initially observed
in a numerical study of an on-lattice homogeneous model consisting of two interacting self-avoiding
walks (SAWs) [22], with the numerical estimate ($c_p \simeq 2.15$) in this case [23, 24] in very good
agreement with the theoretical predictions mentioned above.

It can be noted that, because of the difference between the link energies of $\epsilon_{AT}$
(Adenine-Thymine) and $\epsilon_{GC}$ (Guanine-Cytosine) base pairs, DNA is an intrinsically random
system in which the disordered variables are quenched, in the sense that the composition of the
sequence in terms of base pairs does not vary during the denaturation process. Accordingly, to
understand the behaviour of the system in the thermodynamic limit of chain length $N \to \infty$ in
the canonical ensemble, it is necessary to evaluate the quenched free energy density, by taking the
average (over the random variables) of the logarithm of the partition function for the generated
disordered configurations. However, it appears particularly hard to handle such a quantity
analytically [5^. In fact, whereas the equilibrium properties of the pure system were well understood,
it was only recently that probabilistic mathematical approaches PEIEIE] made it possible to
demonstrate definitively that general theoretical results on the relevance of disorder [27, 28, 29, 30]
also hold for the depinning transition of disordered copolymers. This conclusion holds in particular
for the PS models with $c_p > 2$, despite the fact that such models are instances of peculiar first
order transitions, characterized by the presence of a diverging correlation length, in a system in
which the disorder couples to only one of the two phases. Accordingly, such cases are expected to
exhibit a random fixed point corresponding to a second order or possibly smoother transition,
described by a correlation length critical exponent $\nu_r \geq 2/d = 2$ (with, to our knowledge, no
theoretical prediction available for the value).

Whereas these theoretical determinations are appropriate to account for the behaviour of the
model in the thermodynamic limit, in the experimental situation we are typically concerned with the
thermal denaturation of DNA molecules of specific sequence and given finite size. In this context
it is implicitly assumed that the intrinsic disorder is at the origin of the well-known multi-step
behaviour of the density of closed base pairs, $\theta(\{\epsilon_i\}, N, T)$, the order parameter of
the transition, as accessible through optical absorbance measurements. It is then classical to analyze
the derivative of this quantity with respect to temperature, $d\theta(\{\epsilon_i\}, N, T)/dT$, which
displays distinct peaks reflecting the multi-step behaviour. At this level, from the statistical
mechanics point of view, the first order character of the transition in the pure case is in agreement
with the observation that the steps are typically very steep, hence the corresponding peaks very
sharp. On the other hand, the fact that GC-rich regions tend to open at higher temperatures than
AT-rich ones suggests in a particularly evident way the possibility of non self-averageness for
densities of extensive quantities, in the sense that the positions and the characteristics of the
steps, and of the corresponding peaks, are strongly sequence-dependent.

On general grounds, it is accordingly expected m that, for $N \to \infty$, the temperature range where
such behaviour is observed shrinks around the critical temperature $T_c$, the transition being rounded.
In fact, it is only at the critical point that the presence of a diverging correlation length breaks down
the argument usually used to demonstrate self-averageness of densities of extensive quantities [32],
built on the description of the system as consisting of weakly interacting sub-systems. However, the
validity of such a picture is not obvious for the present almost one-dimensional polymer model, in
which self-avoidance represents an infinite range interaction.

Against the background of the recent mathematical findings PEI El [7], it is expected that the effect
of disorder on this peculiar first order transition would be the same as the one predicted theoretically
for random ferromagnets m (as further confirmed numerically in [DD] for site dilute Ising models),
with in particular the mean value and the fluctuations of an appropriately defined pseudo-critical
temperature scaling with the same exponent. Nevertheless, it is highly desirable for the present
model to further assess this picture numerically, notably in order to better understand the way in
which the relevance of disorder becomes manifest at the finite size level.

3 Model and observables 

3.1 The pure case 

For the sake of completeness, we first briefly recall the salient features of the homogeneous on-lattice
model introduced in [22]: two SAWs with the same origin on a 3d cubic lattice, obeying
the simple rule that two monomers can occupy the same lattice point (gaining a coupling energy
$\epsilon$) if and only if their positions along the two chains are identical. The thermal variables
describing the system can be represented by the ensemble of the $\{s_i\}$, with $s_i = 0$ if the base
pair in position $i$ is in the open state and $s_i = 1$ if it is in the closed state. The possible
configurations are only the ones allowed by self-avoidance, thus introducing an infinite range
interaction of entropic nature in an almost uni-dimensional Ising model [18]. The energetic
contribution of a given configuration to the Boltzmann weight of the "spins" $\{s_i\}$ (hence the
associated Hamiltonian) is then simply written as $\mathcal{H} = -\epsilon \sum_{i=1}^{N} s_i$. It can
be noticed that $\epsilon$ enters the partition function only through the ratio $\epsilon/T$, which
allows one to set $\epsilon = 1$ without loss of generality.

This model can in particular be assimilated to a homogeneous DNA denaturation transition
model à la PS ig El [isi Eni El, in which the two chains represent the two strands of the double
helix, allowing for one free end but not for base-pair mismatches. It is then interesting to notice that
self-avoidance is fully taken into account in the on-lattice numerical simulations, by design. In fact, it
is found [22] that, in the thermodynamic limit of infinite chain length $N \to \infty$, the model
undergoes a first order phase transition with a discontinuity in the order parameter, the density of
closed base pairs $\theta(T) = \frac{1}{N} \sum_{i} \langle s_i \rangle$ ($\langle \cdot \rangle$ holding
for the thermal average), which varies abruptly from zero at high temperature to a finite value below
the critical temperature $T_c$. The properties of the system are better grasped in the behaviour of the
loop-length probability distribution [HEniEsllMlESI:

$$ P(T,l) \propto \frac{e^{-l/\xi(T)}}{l^{c_p}}, \qquad (1) $$

in which appear both the exponent $c_p$, describing the purely algebraic decay of $P(T,l)$ at the
critical point, and the correlation length $\xi(T)$. In the considered case, in which we allow for one
free end, it could be necessary in principle to take into account an additional length scale,
corresponding to the distance between end points [221 Ej. Nevertheless, such a requirement should
basically be considered as associated with a boundary condition [20], therefore not influencing the
thermodynamic limit properties of the model.

In general, upon solving these models in the grand canonical ensemble, the succession of closed
segments (helix) and open loops (coil regions) sums up as a geometric series [18]. However, in the
present case, such a treatment amounts to neglecting the self-avoiding interactions between different
segments and loops, which can be taken into account more correctly by the appropriate renormalization
of $c_p$ [ElEo]. Transitions in homogeneous PS models are then found to be of first order, both in
$d = 3$ and in $d = 2$. Accordingly, they are characterized by an order parameter critical exponent
$\beta_p = 0$, i.e. a discontinuity in $\theta(T)$, but also by the presence of a diverging correlation
length, with related critical exponent $\nu_p = 1$ (since renormalization leads to $c_p > 2$). In the
theoretical framework of almost uni-dimensional phase transitions this correlation length is to be
interpreted as the longitudinal one [18, 20], corresponding to the mean loop length $\langle l \rangle$
along the chain, whose variance diverges. In more detail, based on conformal field theory results [21],
the theoretical prediction in $d = 3$ for a self-avoiding loop embedded in a self-avoiding chain [HEo]
leads to a $c_p$ value between 2.115 and 2.22, in very good agreement with the value
$c_p \simeq 2.15$ found in numerical studies of the on-lattice SAW DNA model [2^ [2l| [25].
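As a purely illustrative numerical check of the loop-length statistics discussed above, the following Python sketch evaluates the first two moments of a distribution of the assumed form $P(l) \propto l^{-c_p} e^{-l/\xi}$ on integer loop lengths $l \geq 1$, for increasing $\xi$. The function name `loop_moments`, the cutoff `l_max` and the discrete normalization are assumptions made for this illustration and are not taken from the computations of the paper.

```python
import math


def loop_moments(xi, c_p=2.15, l_max=None):
    """Mean and variance of l under P(l) ~ l**(-c_p) * exp(-l / xi), l = 1..l_max.

    Hypothetical helper: the cutoff l_max (default 50 * xi) only truncates the
    sums once the exponential factor has become negligible.
    """
    if l_max is None:
        l_max = int(50 * xi)
    norm = first = second = 0.0
    for l in range(1, l_max + 1):
        weight = l ** (-c_p) * math.exp(-l / xi)
        norm += weight
        first += l * weight
        second += l * l * weight
    mean = first / norm
    variance = second / norm - mean * mean
    return mean, variance


if __name__ == "__main__":
    for xi in (10.0, 100.0, 1000.0):
        mean, variance = loop_moments(xi)
        print(f"xi = {xi:7.1f}  <l> = {mean:.3f}  var(l) = {variance:.1f}")
```

For $2 < c_p < 3$ the mean stays bounded while the variance keeps growing as $\xi$ increases, which is the behaviour referred to in the text.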

The behaviours at criticality expected for the singular parts of the various thermodynamic
quantities in PS models are summarized in [Tab. 1], defining implicitly the order parameter critical
exponent $\beta_p$, the correlation length critical exponent $\nu_p$, and the specific heat critical
exponent $\alpha_p$ (the label $p$ distinguishes the pure case from the random one, labelled $r$), all
of them being expressible as functions of the sole variable $c_p$. It can be noticed in particular that
here the energy behaves as the order parameter, hence the singular part of the specific heat behaves as
that of the derivative of the order parameter with respect to the temperature, as well as that of its
derivative with respect to $\epsilon$, i.e. as the singular part of the susceptibility. Moreover, $T_c$
is to be reached from the low temperature phase, as the thermodynamic limit correlation length is
infinite in the whole phase $T > T_c$ (both with and without one free end allowed).

Table 1: Behaviours at criticality of the singular parts of the relevant thermodynamic quantities in
the model, which involve only one independent critical exponent (see text for details).

It can be useful to recall, in order to better specify the link between the on-lattice 3d SAW
DNA model and the associated PS one, that the transition in the grand canonical ensemble occurs
in fact at a tri-critical point in the fugacity-temperature plane (see [22] and references therein),
better described by the crossover exponent $\phi_p = \nu_g^{3d}/\nu_p^{3d}$, where
$\nu_g^{3d} = 1/\mathcal{D}$ is the geometrical correlation length critical exponent (given by the
inverse of the fractal dimension $\mathcal{D}$ of the 3d SAW), whereas $\nu_p^{3d}$ corresponds to the
thermal correlation length critical exponent. Accordingly, the relation $\alpha_p = 2 - \nu_p$ (which in
PS models corresponds simply to a particular case of the well known relation $\alpha_p = 2 - d\,\nu_p$
in $d = 1$) is the analogue, within the almost uni-dimensional framework, of the 3d relation
$\alpha_p = 2 - \mathcal{D}\nu_p^{3d} = 2 - \nu_p^{3d}/\nu_g^{3d} = 2 - 1/\phi_p$ (with the first order
case corresponding to $\phi_p = 1$, i.e. $\nu_p^{3d} = \nu_g^{3d}$). On the other hand, the definition
of $P(T,l)$ given by Eq. (1) implicitly refers to the unidimensional structure of the system (with $l$
the loop length along the chain), and accordingly the corresponding correlation length critical
exponent is given by $\nu_p = \mathcal{D}\nu_p^{3d} = 1/\phi_p$. It is worth noting that the fractal
dimension $\mathcal{D}$ is by definition the same in the presence of disorder.

3.2 The random case 

DNA is intrinsically a disordered system, because of the difference in the coupling energies
between AT and GC base pairs. To account for disorder it is then necessary to introduce a coupling
dependent on the position $i$ of the base pair along a given sequence, $\epsilon_i$, writing accordingly:

$$ \mathcal{H} = -\sum_{i=1}^{N} \epsilon_i s_i . \qquad (2) $$

It is then important to distinguish thermal averages, $\langle \cdot \rangle$ (over the thermal
variables $\{s_i\}$), from disorder averages, $\overline{(\cdot)}$ (over the different possible
realizations of the quenched random variables $\{\epsilon_i\}$, i.e. the different possible sequences).

A simple choice for the couplings [Ml El El 0] consists in considering independent, identically
distributed random variables $\{\epsilon_i\}$, following the binomial law:

$$ P(\epsilon_i) = \frac{1}{2} \left[ \delta(\epsilon_i - \epsilon_{AT}) + \delta(\epsilon_i - \epsilon_{GC}) \right], \qquad (3) $$

which assumes the same mean AT and GC contents, leaving free the choice of the value for the energy
ratio parameter $R = \epsilon_{GC}/\epsilon_{AT}$. In the present work, as in [3], we study the PS model
corresponding to the disordered on-lattice 3d SAW DNA model in [31], with $R = 2$ (obtained by taking
$\epsilon_{GC} = 2$ and $\epsilon_{AT} = 1$).
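The choice of couplings described above can be illustrated by a short Python sketch that draws a random sequence with equal AT and GC probabilities. The function name `random_sequence` and its arguments are hypothetical; the default values $\epsilon_{AT} = 1$ and $\epsilon_{GC} = 2$ correspond to $R = 2$.

```python
import random


def random_sequence(n, eps_at=1.0, eps_gc=2.0, rng=None):
    """Draw n independent couplings, each eps_at or eps_gc with probability 1/2."""
    rng = rng or random.Random()
    return [eps_gc if rng.random() < 0.5 else eps_at for _ in range(n)]


if __name__ == "__main__":
    sequence = random_sequence(20, rng=random.Random(0))
    gc_fraction = sequence.count(2.0) / len(sequence)
    print(sequence, gc_fraction)
```

Passing an explicitly seeded `random.Random` instance makes a given disorder realization reproducible, which is convenient when the same sequence has to be studied at several temperatures.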
Table 2: Definition of relevant thermodynamic observables in the model.

  $e(\{\epsilon_i\}, N, T)$         energy density
  $c(\{\epsilon_i\}, N, T)$         specific heat
  $\theta(\{\epsilon_i\}, N, T)$    order parameter
  $\chi(\{\epsilon_i\}, N, T)$      susceptibility


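In terms of recurrent evaluations, for a system of length $n + 1$ with both base pairs $i = 1$ and
$i = n + 1$ in the closed state ($s_1 = s_{n+1} = 1$), the forward partition function
$Z^f(\{\epsilon_i\}, n + 1, T)$, which sums the contributions of all the configurations weighted by
their Boltzmann factors, is obtained from $Z^f(\{\epsilon_i\}, n, T)$ as:

$$ Z^f(\{\epsilon_i\}, n+1, T) = e^{\beta\epsilon_{n+1} - \log\mu} \left( Z^f(\{\epsilon_i\}, n, T) + \sum_{n'=1}^{n-1} \frac{Z^f(\{\epsilon_i\}, n', T)}{[2(n - n' + 1)]^{c_p}} \right). \qquad (4) $$

Similarly, the backward partition function $Z^b(\{\epsilon_i\}, n - 1, T)$ is obtained as:

$$ Z^b(\{\epsilon_i\}, n-1, T) = e^{\beta\epsilon_{n-1} - \log\mu} \left( Z^b(\{\epsilon_i\}, n, T) + \sum_{n'=n+1}^{N} \frac{Z^b(\{\epsilon_i\}, n', T)}{[2(n' - n + 1)]^{c_p}} + 1 \right), \qquad (5) $$

where the last term in the expression between parentheses takes into account the allowance
for one free end.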

The implementation of the model also implies a choice for the value of the connectivity constant
$\mu$: as is well known, $\mu = 2d$ for a $d$-dimensional random walk on a hyper-cubic lattice and
$\mu \simeq 4.7$ for a 3d SAW on a cubic lattice, leading to $\log\mu \simeq 1.54$ in our case [1].

The total partition function for a sequence $\{\epsilon_i\}$ of length $N$ is given by:

$$ Z(\{\epsilon_i\}, N, T) = \sum_{n=1}^{N} Z^f(\{\epsilon_i\}, n, T) = Z^b(\{\epsilon_i\}, 1, T). \qquad (6) $$

Therefore, it is possible to evaluate the probability for a base pair in position $n$ to be in the closed
state (i.e. the thermal average $\langle s_n \rangle$):

$$ p(\{\epsilon_i\}, N, T, n) = \langle s_n \rangle = \frac{Z^f(\{\epsilon_i\}, n, T)\, Z^b(\{\epsilon_i\}, n, T)}{Z(\{\epsilon_i\}, N, T)\, \exp(\beta\epsilon_n - \log\mu)}, \qquad (7) $$

from which relevant thermodynamic observables (whose definitions are recalled in [Tab. 2]) are
obtained easily. It is worth noticing that results in [Mill] are in agreement with a situation in which,
also in the presence of disorder, the order parameter behaves as the energy (with
$\beta_r = \nu_r - 1$), and the susceptibility as the specific heat (with $\alpha_r = 2 - \nu_r$). Under
such conditions only one independent critical exponent needs to be evaluated.
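A forward/backward recursion of this type, together with the resulting closing probabilities, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: each closed base pair $n$ is assumed to carry the weight $\exp(\beta\epsilon_n - \log\mu)$ (the factor divided out in the closing probability), a loop closed $k$ positions away is assumed to carry $(2k)^{-c_p}$, the boundary values $Z^f(1) = w_1$ and $Z^b(N) = w_N$ are assumed, and the sums are evaluated directly in $O(N^2)$ operations rather than with an approximation of the power law. The function names are hypothetical.

```python
import math


def partition_functions(eps, T, c_p=2.15, mu=4.7):
    """Forward and backward partition functions by direct O(N^2) summation.

    eps is the list of couplings (eps[0] is base pair 1). Assumed conventions:
    each closed pair n carries the weight exp(eps_n / T - log(mu)), a loop
    closed k positions away carries (2 k) ** (-c_p), and the boundary values
    are zf[1] = w[1] and zb[N] = w[N].
    """
    N = len(eps)
    w = [0.0] + [math.exp(e / T) / mu for e in eps]        # w[n], 1-based
    loop = [0.0, 0.0] + [(2.0 * k) ** (-c_p) for k in range(2, N + 1)]
    zf = [0.0] * (N + 1)
    zb = [0.0] * (N + 1)
    zf[1] = w[1]
    for m in range(2, N + 1):
        # previous pair closed, or a loop closed at an earlier pair n
        inner = zf[m - 1] + sum(zf[n] * loop[m - n] for n in range(1, m - 1))
        zf[m] = w[m] * inner
    zb[N] = w[N]
    for m in range(N - 1, 0, -1):
        # next pair closed, a loop closed at a later pair n, or a free end (+1)
        inner = zb[m + 1] + sum(zb[n] * loop[n - m] for n in range(m + 2, N + 1)) + 1.0
        zb[m] = w[m] * inner
    return zf, zb, w


def closed_probabilities(eps, T, c_p=2.15, mu=4.7):
    """Probability that each base pair is closed, p_n = zf[n] * zb[n] / (Z * w[n])."""
    zf, zb, w = partition_functions(eps, T, c_p, mu)
    Z = zb[1]
    return [zf[n] * zb[n] / (Z * w[n]) for n in range(1, len(eps) + 1)]


if __name__ == "__main__":
    eps = [1.0, 2.0, 1.0, 1.0, 2.0, 2.0, 1.0, 2.0]
    zf, zb, _ = partition_functions(eps, T=1.0)
    print(sum(zf[1:]), zb[1])          # two expressions of the total partition function
    print(closed_probabilities(eps, T=1.0))
```

With these conventions the sum of the forward partition functions and the backward partition function at the first base pair enumerate the same configurations, which gives a simple consistency check of an implementation. For long chains the weights grow or shrink exponentially, so a production code would need rescaling or logarithmic arithmetic.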
Finally, we recall that the loop-length probability distribution, apart from the normalization
constant (imposing $\sum_l P(\{\epsilon_i\}, N, T, l) = 1$), can be evaluated as:

$$ P(\{\epsilon_i\}, N, T, l) \propto \frac{1}{[2(l+1)]^{c_p}} \sum_{n} \frac{Z^f(\{\epsilon_i\}, n, T)\, Z^b(\{\epsilon_i\}, n+l+1, T)}{Z(\{\epsilon_i\}, N, T)}. \qquad (8) $$

This expression further highlights the fact that in the present model $c_p$ corresponds to an input.

In this context, the simplest picture describing the random fixed point, in agreement with the
numerical results in [341 [^, is the one in which in the thermodynamic limit:

$$ \overline{P}(T,l) = \lim_{N \to \infty} \overline{P(\{\epsilon_i\}, N, T, l)} \propto \frac{e^{-l/\overline{\xi}(T)}}{l^{c_r}}. \qquad (9) $$

Accordingly, as in the pure case (see Eq. (1)), the average loop-length probability distribution is
expected to display a purely algebraic decay at the critical point, where the average correlation
length $\overline{\xi}(T)$ diverges. This purely algebraic decay is described by the random critical
exponent $c_r$, which in the case of a smooth transition should be linked to the correlation length
critical exponent $\nu_r$ by the relation $c_r = 1 + 1/\nu_r$: the same kind of relation which is known
to be valid for the pure system, for $c_p < 2$.

3.2.1 Implementation of the PS model with self-avoidance in various numerical studies

The PS model considered here corresponds in a detailed way to the on-lattice 3d disordered SAW
DNA model, with the same distribution of the coupling energies, studied numerically in [31]. In
that work, by means of parallel computing, it was possible to collect sufficient statistics up to chain
lengths $N = 800$. Simulations were extensively performed in the case $R = 2$, and different values
of $R$ were also considered. With such choices of the parameters, applying standard finite size
scaling, it was found that the length scale considered was not large enough to reach conclusive results.

In more detail, the estimates obtained for $\phi_r$, which moreover appeared to depend on $R$, were
smaller than $\phi_p = 1$, and correspondingly the ones for $c_r$ were smaller than $c_p \simeq 2.15$
(more precisely, smaller than 2). Nevertheless, both estimates for the exponents were still compatible
with the pure case values, within the errors. In addition, no conclusive results were obtained when
fitting data on the loop-length probability distribution at different temperatures, nor when attempting
an analysis in terms of pseudo-critical temperatures. In any event, these estimates clearly
suggested a value for the correlation length critical exponent definitely smaller than the value
$\nu_r = 2$, i.e. the smallest possible value which would be expected in the case of relevance of
disorder on general grounds [27, 28, 29, 30], with such a conclusion further confirmed for a class of
polymer models encompassing the present one mmm.

This overall situation then appeared all the more unsettled in view of the numerical results reported
in [5, 6] for a PS model with $c_p = 2.15$, supporting the alternative scenario of a pseudo first order
transition. With a view to a global clarification of the situation, taking into account our results
here and in the previous work [3], [Tab. 3] reports the values of the parameters adopted in the various
studies. This table highlights notably the differences in the values of the energy ratios $R$ and of the
connectivity constants $\mu$, the importance and significance of such differences being further
discussed below.

3.3 Scaling laws and pseudo-critical temperatures 

In the presence of a phase transition characterized by a diverging correlation length it is well known
that the critical exponents can be evaluated numerically using finite size scaling techniques
[35, 36, 37]. Such evaluations are based on the theoretical argument, which can be shown to be valid
in the framework of the renormalization group approach [38], that the only relevant
adimensional ratio near the critical point is the one between the thermodynamic limit correlation
length itself, $\xi(T) \sim |T_c - T|^{-\nu}$, and the linear scale $L$ of the system under consideration
(namely, in the present 1d case, $N = L$). In a system without quenched disorder ($\nu = \nu_p$) this
means that






                                           on-lattice        PS model with      PS model with
                                           SAW model         c_p = 2.15         c_p = 2.15
                                           in [31]           in [5, 6]          here and in [1]

energy ratio R                             2                 1.098              2
connectivity constant mu                   ~ 4.7             2                  4.7
cooperativity factor (for the loops)       1                 0.296              1
cooperativity factor (for the free end)    1                 0.5                1
free-end exponent c'_p                     < 10^{-1}         0                  0

Table 3: Values of parameters in the various numerical studies of disordered PS models for DNA
denaturation taking into account self-avoidance. The cooperativity factor $\sigma_0$ accounts for the
cost of opening a loop; it was predicted in [20] that the free-end exponent $c'_p$ takes a small value.


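an observable $O$, with the thermodynamic limit behaviour of its singular part described by the
critical exponent $\rho_p$ ($\lim_{T \to T_c} O(T) \propto |T_c - T|^{\rho_p}$), is expected to follow
the law:

$$ O(L,T) = L^{-\rho_p/\nu_p}\, \tilde{O}\big( (T - T_c)\, L^{1/\nu_p} \big), \qquad (10) $$

with $\tilde{O}$ a scaling function.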
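A data collapse based on a scaling law of this kind can be sketched as follows. The Python snippet assumes the standard finite size scaling convention $O(L,T) = L^{-\rho/\nu}\, f((T - T_c) L^{1/\nu})$; the function name `rescale`, the symbol `rho` for the exponent of the observable, and the synthetic scaling function and numerical values used in the example are assumptions introduced for illustration only.

```python
def rescale(L, T, O, Tc, nu, rho):
    """Map a data point (T, O) measured at size L onto scaling variables.

    Assumes O(L, T) = L**(-rho / nu) * f((T - Tc) * L**(1 / nu)), so that
    x = (T - Tc) * L**(1 / nu) and y = O * L**(rho / nu) satisfy y = f(x)
    for every size L when the trial values of Tc, nu and rho are correct.
    """
    x = (T - Tc) * L ** (1.0 / nu)
    y = O * L ** (rho / nu)
    return x, y


if __name__ == "__main__":
    Tc, nu, rho = 1.5, 2.9, 1.9            # arbitrary illustrative values
    f = lambda x: 1.0 / (1.0 + x * x)      # synthetic scaling function
    for L in (1_000, 10_000, 100_000):
        T = Tc + 0.7 * L ** (-1.0 / nu)    # same scaling variable x = 0.7 at every size
        O = L ** (-rho / nu) * f(0.7)
        print(L, rescale(L, T, O, Tc, nu, rho))
```

In practice the trial values of `Tc`, `nu` and `rho` are varied until the rescaled curves obtained for different sizes fall onto a single one.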

As discussed for instance in [36], in the case of the diluted Ising model (a quenched disordered
system which is similar to the present one, notably in that it does not display
competing interactions, and accordingly no frustration), an analogous scaling picture is expected
to describe random critical points, and should therefore make it possible to evaluate the correlation
length critical exponent $\nu_r$ as well as the critical exponent related to the considered observable.
Such an evaluation can be performed by studying the scaling behaviour of
$\overline{O(\{\epsilon_i\}, N, T)}$, with the average over the quenched variables $\{\epsilon_i\}$
estimated in the standard way, taking the same temperature for all $M_s$ samples (here sequence
realizations):


$$ \overline{O(\{\epsilon_i\}, N, T)} = \frac{1}{M_s} \sum_{s=1}^{M_s} O(\{\epsilon_i\}_s, N, T). \qquad (11) $$

Such a standard finite size scaling technique was applied both to the study of the on-lattice disordered
SAW DNA model [31] and to the corresponding PS model with $c_p = 2.15$ [1]. In this last work,
involving large enough statistics with chain lengths up to V = 2 • lO'^, such approach provided 
notably numerical evidence for the relevance of disorder, with the random critical point of the system
described by a correlation length exponent $\nu_r = 2.9 \pm 0.6$.

However, it was also generally put forward [33] that, when considering random critical points,
the results of standard finite size scaling analyses should be considered with some care. Within this
framework, following a general renormalization group result for random ferromagnets m, we are
led to introduce an additional observable, the effective critical temperature (or pseudo-critical
temperature), $T_c(\{\epsilon_i\}, N)$, studying its dependence on the disordered configurations
considered. In the present case ($d = 1$), with $L = N$, the mean value of this quantity:

$$ \overline{T_c}(N) = \overline{T_c(\{\epsilon_i\}, N)}, \qquad (12) $$

and its fluctuations:

$$ \delta T_c(N) = \left[ \overline{T_c(\{\epsilon_i\}, N)^2} - \overline{T_c(\{\epsilon_i\}, N)}^{\,2} \right]^{1/2}, \qquad (13) $$

are expected to behave as functions of the system size [SUES]:


Tc{N) ~ Tc + CiV-i/^p 
6Tc{N) oc 

Tc{N) ~ Tc + CN-^^^- 
6Tc{N) oc iV-V^'r 


for irrelevant disorder 


for relevant disorder 


(14) 

(15) 


Notably, the theoretical framework m predicts that, for relevant disorder, the mean value and
the fluctuations of the pseudo-$T_c$ should scale with the same exponent. This same framework also
allows one to infer that disorder should be relevant as soon as the specific heat critical exponent of the
pure system fulfills the condition $\alpha_p > 0$ (hence, from the hyper-scaling relation $\alpha_p = 2 - d\,\nu_p$, as
soon as the correlation length critical exponent of the pure system fulfills $\nu_p < 2/d$).
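As an illustration of how the quantities in Eqs. (12)-(15) can be estimated in practice, the following minimal sketch computes the mean and the fluctuation of a set of per-sequence pseudo-critical temperatures, and extracts an exponent from the slope of $\log \delta T_c(N)$ versus $\log N$. The function names are hypothetical, and the unweighted log-log regression is an assumption made here for simplicity, not necessarily the fitting procedure used in this study.

```python
import numpy as np

def pseudo_tc_statistics(tc_samples):
    """Mean (Eq. 12) and fluctuation (Eq. 13) of per-sequence pseudo-Tc values."""
    tc = np.asarray(tc_samples, dtype=float)
    # np.std with ddof=0 equals sqrt(<Tc^2> - <Tc>^2)
    return tc.mean(), tc.std()

def fluctuation_exponent(sizes, flucts):
    """Estimate 1/nu from the slope of log(dTc) versus log(N) (unweighted fit)."""
    slope, _ = np.polyfit(np.log(sizes), np.log(flucts), 1)
    return -slope
```

If the returned value is close to 1/2 while the mean value scales with a different exponent, the data would point to the situation of Eq. (14); equal exponents for both quantities correspond to Eq. (15).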

On the other hand, it is worth noting that a behaviour of the mean value and of the fluctuations
of the pseudo-$T_c$ described by two different exponents as in Eq. (14), corresponding to a situation
usually encountered when disorder is irrelevant (as the specific heat of the pure system displays no
divergence for $\alpha_p < 0$), could in fact also be attributed to the presence of two correlation lengths.
The theoretical basis for such a possibility was laid within the renormalization group framework,
notably in the case of random transverse field Ising chains [39]. Furthermore, in addition to the
results on the PS model with self-avoidance in [5,6], evidence for such a scenario, corresponding to a
pseudo first order transition, was reported in various other cases (see for instance m and mi).

In such a context, in the case of the PS model with self-avoidance, an independently diverging
correlation length could be pictured as related to the free-end distance, or, perhaps on firmer
grounds, it could be hypothesized that the divergence of the typical loop is different from that of the
average one. Accordingly, the transition would be of first order, as in the pure case, from the point of
view of the behaviour of the typical observables (a given sequence undergoes a first order transition,
with $\nu_{r,1} = 1$) and it would be of second order from the point of view of the average ones (whose
behaviour is ruled by $\nu_{r,2} = 2$) [5,6].

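In fact, it is also clear within this framework [33] [5,6] that the standard scaling law describing
the finite size behaviour of a thermodynamic observable $O$ with critical exponent $\rho_r$ is expected to
be better obeyed by the quantity $\widetilde{O}$:

$$\widetilde{O}(\{\epsilon_i\},N,T) = N^{-\rho_r/\nu_r}\,\widehat{O}\left[(T_c(N) - T)\,N^{1/\nu_r}\right], \qquad (16)$$

where $\widehat{O}$ is a scaling function and where we label by $\widetilde{(\cdot)}$ the average over disorder performed by
taking into account the sequence-dependent $T_c(\{\epsilon_i\},N)$, according to:

$$\widetilde{O}(\{\epsilon_i\},N,T) = \frac{1}{M_s}\sum_{k=1}^{M_s} O\left[\{\epsilon_i\}_k,N,\,T - T_c(\{\epsilon_i\}_k,N) + T_c(N)\right]. \qquad (17)$$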


In such a way, one avoids that the results are governed by the fluctuations of the pseudo-Tc. 
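The difference between the two averaging procedures can be sketched numerically as follows. This is only an illustration under stated assumptions: the function names are hypothetical, each sample curve is assumed to be tabulated on a common temperature grid, and Eq. (17) is read here as a translation of each sample curve along the temperature axis so that its own pseudo-$T_c$ lands on the mean value $T_c(N)$ before averaging.

```python
import numpy as np

def standard_average(curves):
    """Eq. (11): plain average of all sample curves at the same temperature."""
    return np.mean(np.asarray(curves, dtype=float), axis=0)

def aligned_average(temps, curves, tc_samples):
    """Average after translating each curve so that its pseudo-Tc sits at the mean pseudo-Tc."""
    temps = np.asarray(temps, dtype=float)
    tc_samples = np.asarray(tc_samples, dtype=float)
    tc_mean = tc_samples.mean()
    shifted = []
    for curve, tc_k in zip(curves, tc_samples):
        # The translated curve at T takes the original value at T - tc_mean + tc_k.
        # np.interp holds the end values constant outside the tabulated range.
        shifted.append(np.interp(temps - tc_mean + tc_k, temps, curve))
    return np.mean(shifted, axis=0)
```

With sharply peaked sample curves whose peak positions fluctuate, the standard average is broadened and lowered by the spread of the pseudo-$T_c$ values, whereas the aligned average keeps the typical peak height.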

Finally, in the case of disorder relevance, theoretical results m imply a strong lack of self-averaging
in the thermodynamic observables which are singular at the critical point, despite these being
densities of extensive quantities. By definition, self-averaging is measured from the ratio:

$$\mathcal{R}_O = \frac{\overline{[O(\{\epsilon_i\},N,T)]^2} - \left[\overline{O(\{\epsilon_i\},N,T)}\right]^2}{\left[\overline{O(\{\epsilon_i\},N,T)}\right]^2}, \qquad (18)$$

with the strong non self-averaging behaviour corresponding to $\mathcal{R}_O \sim 1$ (whereas $\mathcal{R}_O \sim 1/N$ in the
usual self-averaging behaviour). Notably, these results for observables such as the order parameter
and the susceptibility are expected in the present model both in the case where the mean value and
the fluctuations of the pseudo-critical temperature scale with the same exponent and in the case of
a pseudo first order transition.

Indeed, this behaviour has been observed in the order parameter in the PS models with different
$c_p$ values studied in [6], and in particular in the peculiar case of $c_p = 2.15$. On the other hand, for the
PS model here (as already considered in [3]) as well as for the on-lattice SAW DNA (as considered in
[23]), such behaviour was clearly suggested by the presence of multiple steps in the order parameter,
and correspondingly of several peaks in the susceptibility, in a non-negligible fraction of the sequences,
already for relatively small chain lengths.
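For concreteness, the ratio of Eq. (18) can be evaluated from the values taken by an observable over the different sequences at fixed $N$ and $T$, as in the short sketch below (the function name is hypothetical).

```python
import numpy as np

def self_averaging_ratio(values):
    """Relative variance of an observable over disorder samples (Eq. 18)."""
    o = np.asarray(values, dtype=float)
    # np.var with ddof=0 is <O^2> - <O>^2
    return np.var(o) / o.mean() ** 2
```

Computing this ratio for increasing $N$ at the critical temperature allows one to distinguish a decay compatible with $1/N$ from a saturation at a finite value.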

Against this background, a rather subtle point concerns the definition of the pseudo-$T_c$ itself.
From this point of view, in [5,6] two different definitions were introduced, with their study leading
to the same conclusions: one is obtained from the free energy behaviour and the other, which we also
study, is the crossing point of the order parameter curves $\theta(\{\epsilon_i\},T,N)$, considering sequences of
increasing lengths, obtained by the concatenation of an increasing number of copies of a given original
sequence.

Importantly, the absence of multi-step behaviour in the order parameter in (31 [n], for the PS 
model with Cp = 2.15, represents the main qualitative difference with the results obtained for the 
on-lattice disordered SAW DNA or for the corresponding PS model in [Mill]. Indeed, to the best 
of our knowledge, from this point of view the present work should represent the first attempt to 
define a pseudo-critical temperature with the order parameter displaying a multi-step behaviour. 
Noticeably, various definitions of the pseudo-Tc have been considered, such as for spin glasses m) 

On the other hand, in [33] the pseudo-$T_c$ was defined as the temperature corresponding to the
unique maximum of the susceptibility. We are led to generalize this definition, and to take as
pseudo-$T_c$ the temperature at which the susceptibility reaches its absolute maximum (also checking
the agreement of the scaling law with the alternative definition involving the concatenated sequence
procedure). This choice, made possible by the accurately measured temperature dependence of the
observables in the present study, appears generally to be the most reasonable one and it should
moreover be particularly appropriate for identifying a possible pseudo first order
transition. In fact, when defining the pseudo-$T_c$ in this way, by applying Eq. (17) one finds:

$$\max\{\widetilde{\chi}(\{\epsilon_i\},N,T)\} = \overline{\max\{\chi(\{\epsilon_i\},N,T)\}}. \qquad (19)$$

Accordingly, such a definition should be effective for providing evidence for a diverging behaviour
of the typical susceptibility (hence, in the present case in which the susceptibility behaves as the
specific heat, for a specific heat exponent $\alpha_{r,1} = 1 = \alpha_p$, and also, by hyper-scaling in $d = 1$, for
$\nu_{r,1} = 1 = \nu_p$).
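A minimal sketch of this definition is given below: the pseudo-$T_c$ of a sequence is taken as the position of the absolute maximum of its susceptibility curve, whatever the number of secondary peaks. The function name is hypothetical, and the parabolic refinement through the three grid points around the maximum (which assumes an equally spaced temperature grid) is an addition made here for illustration, not a step described in the text.

```python
import numpy as np

def pseudo_tc_from_susceptibility(temps, chi):
    """Temperature of the absolute maximum of chi(T), even with several peaks."""
    temps = np.asarray(temps, dtype=float)
    chi = np.asarray(chi, dtype=float)
    k = int(np.argmax(chi))
    if 0 < k < len(chi) - 1:
        y0, y1, y2 = chi[k - 1], chi[k], chi[k + 1]
        curvature = y0 - 2.0 * y1 + y2
        if curvature < 0.0:
            # vertex of the parabola through the three points, in units of the grid step
            shift = 0.5 * (y0 - y2) / curvature
            return temps[k] + shift * (temps[k + 1] - temps[k])
    return temps[k]
```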

It was further hypothesized in [6] that the presence of two correlation lengths ruled by different
critical exponents could be inferred from the probability distribution of the loop lengths. Qualitatively,
one would expect in particular different behaviours for $\overline{\log P(\{\epsilon_i\},N,T)}$ and $\log \overline{P(\{\epsilon_i\},N,T)}$,
respectively. Even though such a conclusion did not appear to be confirmed by the results in [3],
with the data at $T = T_c$ not displaying asymptotically detectable differences, it appears desirable
to further investigate this point in detail, with careful analysis of data over the whole temperature
range.

3.4 The SIMEX scheme and numerical implementations 

For efficient numerical implementation, the recursive equations for the forward and backward
partition functions in the PS model are solved with the SIMEX scheme (SIMulations with EXponentials),
resorting to an approximation of the power law $1/l^{c_p}$ with a sum of exponentials. The basic idea in
the SIMEX scheme, as used in the context here BUM, was originally expressed in [43], specifically
for the numerical study of PS models of denaturation transitions in linear DNA molecules. However,
the generality of the powerful idea at the basis of this representation of the long-range effect as a sum
of exponentials was not appreciated at its true value, as in the original work [43] it was implemented
in the context of conditional probabilities specific to the model considered.

The formulation of the SIMEX scheme proper then proceeded in two steps. First the original idea was
reformulated in more general terms for the linear PS model, in the context of recursions written
directly in terms of partition functions [Sj. It was then possible, on such a basis, to propose
generalizations of the idea to higher order models involving several mutually coupled long-range effects
[45], with the corresponding algorithmic complexities reduced by several orders of magnitude. In
order to be effective such a method relies on the accurate representation of long-range effects (such
as the probability law for the lengths of the loops, in the simplest case of the model here) as sums of
exponentials. The Padé-Laplace method [Ml HZ] provides an elegant generic solution to this problem,
not only in the case of purely decaying functions, such as power-laws (represented as sums of
real exponentials), but more generally allowing functions to be represented with complex exponentials.
As an illustration of potential applications, the SIMEX scheme was used for example to implement
sequence alignments with realistic gap models, in bioinformatics [48].
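The mechanism by which a sum-of-exponentials representation reduces the cost of a long-range sum can be sketched as follows. This is a generic illustration, not the PS recursion itself: the array `x` stands for any previously computed sequence (in the actual model the partition functions enter the sum and are themselves updated by it), and the function names are hypothetical. The direct evaluation costs $O(N^2)$ operations, while the recursive one keeps a single running accumulator per exponential and costs $O(N M)$ for $M$ exponentials.

```python
import numpy as np

def long_range_sum_direct(x, a, b):
    """y[n] = sum_{l=1..n} w(l) x[n-l], with w(l) = sum_k a_k exp(-b_k l); O(N^2)."""
    x, a, b = (np.asarray(v, dtype=float) for v in (x, a, b))
    y = np.zeros(len(x))
    for n in range(1, len(x)):
        l = np.arange(1, n + 1)
        w = (a[:, None] * np.exp(-np.outer(b, l))).sum(axis=0)
        y[n] = np.dot(w, x[n - l])
    return y

def long_range_sum_recursive(x, a, b):
    """Same sum in O(N * M), using one running accumulator per exponential."""
    x, a, b = (np.asarray(v, dtype=float) for v in (x, a, b))
    decay = np.exp(-b)
    s = np.zeros(len(a))
    y = np.zeros(len(x))
    for n in range(1, len(x)):
        # s_k[n] = exp(-b_k) * (s_k[n-1] + a_k * x[n-1])
        s = decay * (s + a * x[n - 1])
        y[n] = s.sum()
    return y
```

Both functions return the same values when the kernel is exactly a sum of exponentials; the accuracy of the overall scheme then rests entirely on how well the sum of exponentials reproduces the power law.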

Against this background, with the approximation used here,

$$\frac{1}{(2l)^{c_p}} \simeq \sum_{k=1}^{M_e} a_k\,e^{-b_k l}, \qquad (20)$$


it is important to stress that, from the analytic point of view, replacing the power-law with
exponentials is expected to change the nature of the singularity in the thermodynamic limit partition function.
However, on numerical grounds, such a replacement is not expected to influence the results of finite
size scaling analyses. Indeed, with the accuracy of the multi-exponential representation adopted
(see below), the numerical approximation is practically indistinguishable (i.e. within prescribed
reasonable limits) from the analytic expression, well beyond the largest length scale considered.

In this direction, because of the importance of the underlying numerical problem, it is relevant
to further recast the derivation of multi-exponential representations in the general context of
approximation problems. Since Prony (1795) the problem of obtaining numerical representations of
given functions as sums of exponentials has been addressed in many fields and contexts. As a matter
of fact, this problem comes in two flavours that are, in principle, very different: identification or
approximation. In the identification case the given function is supposed, in essence, to be a sum
of exponentials and the problem consists in retrieving precisely the genuine number of exponential
components, with the estimates of the associated parameters. This problem is reputedly difficult,
notably in the case of real exponentials, because of ill-conditioning. It is then easy to over-fit the
data, with methods such as least-squares, with an increasing number of components, thus missing the
aim of proper identification of the model in terms of its original components. In the alternative
case, related to approximation, the problem is completely different, as the given function,
known analytically, is not a sum of exponentials and the aim is to obtain the best possible
approximation in such a form. The present situation, concerning the power-law for the long-range
effect, belongs to the approximation case, and we need to ensure an accurate representation
of the power-law by the sum of exponentials up to the largest system sizes considered.
The number of components in the multi-exponential representation will then depend on
the maximal size of the system considered, with the need to introduce, according to this size,
additional components with increasingly smaller $b_k$ parameters (see Eq. (20)): in the numerical fit, for
increasingly larger values of the variable $l$, close to the maximal one in the model, the exponential
components with the smallest $b_k$ parameters are required to decay according to the corresponding
values of the power-law.
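To make the approximation problem concrete, the sketch below builds a sum-of-exponentials representation of $1/(2l)^{c}$ on $1 \le l \le l_{max}$. It is emphatically not the Padé-Laplace method nor the set of coefficients used in this work: it is a simple alternative, chosen here because it needs no fitting, in which the rates $b_k$ are placed on a geometric grid and the weights $a_k$ come from a trapezoidal discretization of the identity $l^{-c} = \Gamma(c)^{-1} \int_0^\infty b^{c-1} e^{-b l}\, db$ in the variable $\ln b$. The grid bounds ($0.05/l_{max}$ and $30$) and the function names are assumptions made for this illustration.

```python
import math
import numpy as np

def power_law_as_exponentials(c, l_max, n_exp):
    """Coefficients (a_k, b_k) with sum_k a_k exp(-b_k l) ~ 1/(2l)^c for 1 <= l <= l_max."""
    # Rates on a geometric grid; the smallest rate scales as 1/l_max.
    b = np.geomspace(0.05 / l_max, 30.0, n_exp)
    h = math.log(b[-1] / b[0]) / (n_exp - 1)
    # Trapezoidal weights in u = ln(b); the factor 2**c accounts for (2l) instead of l.
    a = h * b ** c / (math.gamma(c) * 2.0 ** c)
    return a, b

def max_relative_error(a, b, c, l_max):
    """Largest relative deviation from 1/(2l)^c over l = 1, ..., l_max."""
    l = np.arange(1, l_max + 1, dtype=float)
    approx = (a[:, None] * np.exp(-np.outer(b, l))).sum(axis=0)
    exact = (2.0 * l) ** (-c)
    return float(np.max(np.abs(approx / exact - 1.0)))
```

The sketch shows the point made above about the smallest $b_k$: the lower bound of the rate grid must decrease with $l_{max}$, so covering larger systems at fixed accuracy requires additional slowly decaying components. A dedicated method or a least-squares refinement would be expected to reach a given accuracy with fewer components than this uniform-grid construction.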

In such a context, the Padé-Laplace method (which encompasses various other formulations such
as the Prony method or the method of moments as particular cases) was originally formulated for
the identification problem. It was however also used in the approximation context, for obtaining
numerical approximations of the power-law with sums of exponentials (see in this respect
Fig. 4 in reference [46]). Interestingly, in such a case, an identification-like behaviour was observed,
concerning the number of exponential components in the representation: more precisely, it was
observed that the number of components appropriate for systems of size $N_{max}$ followed essentially a
law in $\ln(N_{max})$ (10 components for $N_{max}$ up to $2 \cdot 10^4$, 14-15 components for $N_{max}$ up to $10^6$, such
as in the genomic analyses miiisKis]). Of course, within the approximation context, the original
model for the multi-exponential representation as obtained by the identification procedure can be
further used as an initial guess for least-squares fits, for further refined approximation. In various studies,
including in the original Fixman-Freire paper [53], multi-exponential representations for the power-law
were obtained resorting to different methods. It is however interesting to notice that increasingly
more complex models implemented in the context of the generalizations of the SIMEX scheme could involve
general long-range effects that are not purely decaying. In such a case it would then be necessary to resort to
approximations with sums of general complex exponentials, as allowed by the Padé-Laplace method
[5B1157].

N        M_s     T_min   T_max
100      2000    0.95    1.2
200      2000    0.95    1.2
500      2000    1.0     1.16
750      1000    1.0     1.16
1000     1000    1.0     1.15
2500     1000    1.02    1.14
5000     1000    1.02    1.14
7500     1000    1.04    1.12
10000    600     1.04    1.14
15000    500     1.04    1.12
20000    500     1.04    1.12

Table 4: Number of sequences ($M_s$) and temperature intervals ($[T_{min}, T_{max}]$) adopted in the
computations, for each chain length $N$.

Here, in order to reproduce accurately the behaviour of $1/(2l)^{2.15}$, we implement a SIMEX
scheme with $M_e = 15$ exponentials [5], adopting the same representation (in terms of the values of the
coefficients $\{(a_k, b_k),\ k = 1, \dots, M_e\}$) as the one in [5,6], which proved to be adequate for chain
lengths of order $N = 2 \cdot 10^6$. Further details on the numerical implementation of the SIMEX scheme
are provided in the Appendix in [5].

Summarizing, we study the disordered PS model with $c_p = 2.15$ by adopting the values $R = 2$,
$\sigma_0 = 1$ and $\log \mu = 1.54$. The conditions chosen to collect statistics for each chain length $N$ are recalled
in Table 4 [5], in terms of the number $M_s$ of different sequences $\{\epsilon_i\}$ used and of the temperature
intervals considered ($[T_{min}, T_{max}]$, the temperature intervals always being divided into $M_T = 250$
equally spaced sub-intervals).

For each sequence, the specific heat is evaluated by computing numerically the derivative of the
energy density with $\delta T = (T_{max} - T_{min})/M_T$, and for the numerical computations of the derivatives
in the susceptibility the values $\delta\epsilon_{AT} = \beta\,10^{-4}$ and $\delta\epsilon_{GC} = R\,\delta\epsilon_{AT}$ are adopted. It was checked
that such choices ensure appropriate accuracy for the calculations (with notably the errors on the
positions of maxima consistently smaller than the sample-to-sample fluctuations) [5]. Finally, the
errors on average quantities are evaluated from sample-to-sample fluctuations.
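The two numerical steps just described can be sketched as follows. The function names are hypothetical, and two choices are assumptions made for this illustration since the text does not specify them: the derivative is taken with central finite differences on the temperature grid, and the error on an average is taken as the standard error of the mean over the sequences.

```python
import numpy as np

def specific_heat(temps, energy_density):
    """Finite-difference derivative de/dT on the temperature grid."""
    return np.gradient(np.asarray(energy_density, dtype=float),
                       np.asarray(temps, dtype=float))

def sample_to_sample_error(values):
    """Standard error of a disorder average from sample-to-sample fluctuations."""
    v = np.asarray(values, dtype=float)
    return v.std(ddof=1) / np.sqrt(v.size)
```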

4 Study with pseudo-critical temperatures: results and insights 

4.1 Pseudo-$T_c$ definitions and properties

The order parameter $\theta$ as a function of the temperature is plotted in Fig. 1, for a given
sequence of length $N = 2500$, and for sequences obtained by concatenating $t = 2$ and $t = 4$ times
the original sequence. In the following Fig. 2, we present data on the specific heat $c$ and the
susceptibility $\chi$ for the same given sequence of length $N = 2500$. We can observe in Fig. 1 a clearly
defined multi-step behaviour in the order parameter plot, with correspondingly three distinct peaks
in the specific heat and in the susceptibility plots in Fig. 2, even for this relatively short sequence.

We first notice, comparing the behaviours in Fig. 2a and Fig. 2b, that the pseudo-critical
temperature $T_c(\{\epsilon_i\},N)$, defined as the temperature at which the susceptibility reaches its absolute
maximum, is very close to the corresponding one obtained from the specific heat plot. This conclusion
holds independently of the choice of a specific sequence. In fact, as already observed in [1], for the
parameter values used (notably the ratio $R = 2$ between the energies associated with GC and AT links),
for each disordered configuration, the specific heat, the derivative with respect to temperature of
the order parameter and the susceptibility always display very similar behaviours. In particular, as
detailed below, it appears that the disorder averages of specific heat and of susceptibility (referring
either to standard averages $\overline{(\cdot)}$ or to averages which take into account the sequence-dependent
pseudo-critical temperature $\widetilde{(\cdot)}$) display the same critical behaviour at $T_c$ and the same kind of finite size
corrections to scaling.



Figure 1: Order parameter $\theta$ plotted as a function of the temperature $T$ for a given sequence of length
$N = 2500$ and for the sequences of lengths $N = 5000$ and $N = 10000$ obtained by concatenating
$t = 2$, respectively $t = 4$, copies of the original sequence.

Figure 2: a) Specific heat $c$; b) susceptibility $\chi$. Data are for the same given sequence of length
$N = 2500$ considered in the previous Fig. 1, plotted as functions of temperature.


Interestingly, despite the multi-step behaviour, we observe in Fig. 1 that the densities of closed
base pairs $\theta(\{\epsilon_i\},N,T)$, $\theta(\{\epsilon_i\},2N,T)$ and $\theta(\{\epsilon_i\},4N,T)$, respectively, cross at a well defined
$T$-value, thus providing, following [5,6], a different possibility for defining a pseudo-critical temperature
$T_c^{*}(\{\epsilon_i\},N)$, which we also study here. It appears reasonable to assume that the scaling properties
of this observable should not depend on the particular way it is defined. However, to the best of
our knowledge, there are no previous attempts at defining and studying in detail pseudo-critical
temperatures in the presence of several distinct peaks in the specific heat or in the susceptibility, and
our definition of $T_c(\{\epsilon_i\},N)$ as the position of the absolute maximum of the susceptibility appears
reasonable in such a case. In any event, in the present work, we checked that with our choice we get
results compatible with those obtained with the alternative definition above, for chain lengths up to
$N = 2500$.
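The alternative, crossing-point definition can be sketched as below for two order-parameter curves tabulated on a common temperature grid, for instance those of a sequence and of its two-fold concatenation. The function name and the use of linear interpolation between grid points are assumptions made for this illustration.

```python
import numpy as np

def crossing_temperature(temps, theta_a, theta_b):
    """First temperature at which two order-parameter curves cross (linear interpolation)."""
    temps = np.asarray(temps, dtype=float)
    diff = np.asarray(theta_a, dtype=float) - np.asarray(theta_b, dtype=float)
    # indices k such that the difference changes sign between temps[k] and temps[k+1]
    change = np.nonzero(diff[:-1] * diff[1:] < 0.0)[0]
    if change.size == 0:
        return None  # no strict sign change found on this grid
    k = change[0]
    frac = diff[k] / (diff[k] - diff[k + 1])
    return temps[k] + frac * (temps[k + 1] - temps[k])
```

For curves with a multi-step structure, several sign changes may occur; the sketch returns only the first one, so in practice the whole list of crossings should be inspected.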



Figure 3: Susceptibility as a function of temperature $T$, plotted for independently randomly generated
sequences of lengths $N = 2500$, $N = 5000$ and $N = 10000$, respectively. The randomly generated
sequence of length $N = 2500$ is the same as the one used in the previous figures, hence the
corresponding susceptibility is in particular the same as in Fig. 2b.


Moreover, checking the behaviour of the order parameters for sequences obtained by concatenation
of $t$ copies of the original sequence, we can observe in Fig. 1 that the multi-step behaviour
becomes less evident for increasing chain lengths. To better understand the meaning of this observation,
we consider, in Fig. 3 and Fig. 4, the evaluations of the susceptibility for different possible
instances of sequences of increasing lengths, involving the original sequence of length $N = 2500$, for
which this observable displays three distinct peaks. In detail: in Fig. 3 the sequences of lengths
$N = 5000$ and $N = 10000$ are randomly generated, with accordingly no correlations between the
three increasingly longer sequences; in Fig. 4a, the sequences of length $2N$ correspond to the
concatenation of the original sequence of length $N$ with a randomly generated sequence of length $N$; in
Fig. 4b the sequences of length $tN$ correspond to the concatenation of $t$ instances of the original
sequence of length $N$.

Most strikingly, in all these instances, increasing values of $N$ do not appear to be associated
with the concomitant appearance of a larger number of peaks. Nevertheless, we can observe in
these figures interesting qualitative differences between the three cases. In Fig. 3, the behaviour
of the susceptibility for the longest sequences does not appear to be correlated with that of the shortest
one, with notably such behaviour changing completely between $N = 2500$ and $N = 5000$ and the
position of the absolute maximum for the sequence of length $N = 5000$ corresponding to the position
of a minimum for the sequence of length $N = 2500$. On the contrary, in Fig. 4, the presence of
correlations between sequences of various lengths appears to be associated, not surprisingly, with
concomitant correlations in the plots of the corresponding susceptibilities.

More specifically, it can be noticed in Fig. 4 that in case a), which is the closest to the
conditions of the Monte-Carlo like numerical studies on the disordered on-lattice SAW DNA model
[34], significant correlations between susceptibility plots are observed for the longest sequences
($N = 5000$ and $N = 10000$), whereas in case b), which corresponds to the case of sequences generated
by concatenations of a same original sequence, strong correlations between susceptibility plots are
observed for all sequence lengths. In case b) it can be further noted that, with increasing chain
lengths, the position of the absolute maximum of the susceptibility appears to approach the position
at which the crossing of the order parameters is observed in Fig. 1.

Figure 4: Susceptibility as a function of temperature, plotted for given sequences of lengths $N = 2500$,
$N = 5000$ and $N = 10000$, respectively. The randomly generated sequence of length $N = 2500$ is the
same in the two plots, corresponding to the one used in the previous figures, hence the corresponding
susceptibility is in particular the same as in Fig. 2b. a) The sequence of length $N = 5000$
coincides with the original sequence of length $N = 2500$ for the first half, with the second half being
randomly generated and, similarly, the sequence of length $N = 10000$ coincides with the sequence of
length $N = 5000$ for the first half, with the second half being randomly generated; b) The sequences
of length $N = 5000$ and $N = 10000$ correspond to the concatenation of $t = 2$, respectively $t = 4$,
copies of the original sequence of length $N = 2500$.

As a matter of fact, the comparisons above highlight the importance of the protocols adopted for 
the generation of increasingly longer sequences. Moreover, in agreement with the picture proposed in 
[1], and as further detailed below, these comparisons also suggest the importance of the contribution 
of rare regions in the sequences to the thermodynamic limit behaviour of the disordered system 
considered. Indeed, in the concatenation scheme, as can be noticed notably in Fig. 4b, the
maximum length for such rare regions is strictly restricted by the maximum length of rare regions 
in the original sequence used. 

4.2 The scaling behaviours of $T_c(N)$ and $\delta T_c(N)$

The values of $T_c(N) = \overline{T_c(\{\epsilon_i\},N)}$ and $\delta T_c(N) = \{\overline{[T_c(N) - T_c(\{\epsilon_i\},N)]^2}\}^{1/2}$, as functions of $1/N$,
are plotted in Fig. 5, along with the best fits obtained following $T_c(N) \sim T_c + C\,N^{-1/\nu_{r,1}}$ and
$\delta T_c(N) \propto N^{-1/\nu_{r,2}}$, hence allowing the possibility, in principle, of two different exponents (see
Eqs. (14)-(15) and the associated discussions). In addition, data for the mean value and for the
fluctuations of $T_c^{*}(\{\epsilon_i\},N)$, defined according to [5,6], are also plotted in Fig. 5 for $N \le 2500$.

For each $N$-value, we also checked that the behaviour of $T_c(\{\epsilon_i\},N)$ agrees with a Gaussian
distribution. We notice first of all that, interestingly, whereas strong finite size corrections are observed
in the behaviour of average quantities when applying standard scaling laws [1], it appears that data
on the mean value and on the fluctuations of the pseudo-critical temperature agree well with the
corresponding expected laws over the whole $N$-range considered.

On such a basis it is then straightforward to determine whether the values of the exponents
associated with typical and average quantities, respectively, are different. Indeed, it is immediately
obvious from the figure that the data for $T_c(N)$ and $\delta T_c(N)$ (as the ones for $T_c^{*}(N)$ and $\delta T_c^{*}(N)$)
display essentially the same $N$-dependence. This result implies that the pseudo first order transition
scenario cannot describe the observed behaviour in the present case, since the mean value and the
fluctuations of the pseudo-$T_c$ are ruled by the same exponent $\nu_r$ (which appears to be larger than
2).

$N_0$     $T_c$            $1/\nu_{r,1}$    $1/\nu_{r,2}$
100     1.098 ± 0.001    0.33 ± 0.03    0.405 ± 0.01
200     1.101 ± 0.002    0.27 ± 0.04    0.38 ± 0.01
500     1.101 ± 0.003    0.27 ± 0.06    0.37 ± 0.01
750     1.099 ± 0.003    0.32 ± 0.08    0.365 ± 0.01
1000    -                -              0.34 ± 0.02
2500    -                -              0.33 ± 0.06

Table 5: Estimations of $T_c$, $1/\nu_{r,1}$ and $1/\nu_{r,2}$ obtained disregarding chains of length $N < N_0$, for the
different $N_0$ values considered. For $T_c$ and $1/\nu_{r,1}$ estimations are from $T_c(N)$ data, and for $1/\nu_{r,2}$
from $\delta T_c(N)$ data (see text).

Figure 5: a) Plots of the mean values, $T_c(N)$ and $T_c^{*}(N)$, of the pseudo-critical temperatures as
functions of $1/N$. $T_c(N)$ values are plotted for the various chain lengths (red marks with error bars),
corresponding for a given $N$ to the mean value of $T_c(\{\epsilon_i\},N)$, associated with the temperature at
which the susceptibility reaches its absolute maximum, for the various sequences. In addition, mean
values of $T_c^{*}(\{\epsilon_i\},N)$ are plotted for $N \le 2500$ (green marks with error bars), defined as the crossing
points of the order parameters $\theta(\{\epsilon_i\},tN,T)$, for $t = 1, 2, 4$. The dotted line corresponds to the best
fit of $T_c(N)$, according to the scaling law $T_c(N) \sim T_c + C\,N^{-1/\nu_{r,1}}$ (with $T_c = 1.098 \pm 0.001$ and
$1/\nu_{r,1} = 0.33 \pm 0.03$). b) Plots of the fluctuations $\delta T_c(N)$ and $\delta T_c^{*}(N)$, associated with $T_c(N)$ and
$T_c^{*}(N)$ in a), as functions of $1/N$. The dotted line corresponds to the best fit for $\delta T_c(N)$, according
to the scaling law $\delta T_c(N) \propto N^{-1/\nu_{r,2}}$ (with $1/\nu_{r,2} = 0.405 \pm 0.01$).


For a more detailed analysis, Table 5 reports estimations of $T_c$ and $1/\nu_{r,1}$ (from $T_c(N)$), and
of $1/\nu_{r,2}$ (from $\delta T_c(N)$), obtained disregarding chain lengths $N < N_0$, for different $N_0$ values. This
analysis does not reveal any obvious finite size corrections to scaling in the behaviour of $T_c(N)$. On
the other hand $1/\nu_{r,2}$ appears to display a weak dependence on $N_0$, decreasing to the value
$1/\nu_{r,2} \simeq 0.33 - 0.34$ when only chain lengths $N \ge 1000$ are considered. Thus we are led to similar values
for the two exponents, with $1/\nu_{r,1} = 1/\nu_{r,2} = 1/\nu_r = 0.35 \pm 0.05$, and accordingly to the estimates:
$\nu_r = 2.9 \pm 0.4$, and $1 + 1/\nu_r = 1.35 \pm 0.05$. It is moreover possible to notice that analysis of
the mean value and of the fluctuations of $T_c^{*}(\{\epsilon_i\},N)$ leads to estimates ($1/\nu_{r,1} = 0.44 \pm 0.06$ and
$1/\nu_{r,2} = 0.38 \pm 0.01$) which are consistent with the previous ones, even though in this case only chain
lengths $N \le 2500$ are considered.
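The kind of analysis summarized in Table 5 can be sketched as follows: the law $T_c(N) = T_c + C\,N^{-x}$ is fitted while discarding chain lengths below a cutoff, and the fit is repeated for increasing cutoffs to look for a drift of the parameters. The function name, the scan over a grid of trial exponents and the unweighted least squares are assumptions made for this illustration; the actual fits may well weight the points by their error bars.

```python
import numpy as np

def fit_tc_scaling(sizes, tc_values, n_min=0, exponents=None):
    """Fit Tc(N) = Tc_inf + C * N**(-x) using only sizes N >= n_min; returns (Tc_inf, C, x)."""
    sizes = np.asarray(sizes, dtype=float)
    tc_values = np.asarray(tc_values, dtype=float)
    keep = sizes >= n_min
    n, y = sizes[keep], tc_values[keep]
    if exponents is None:
        exponents = np.linspace(0.05, 1.0, 96)  # trial values of x = 1/nu
    best = None
    for x in exponents:
        # for a fixed exponent the model is linear in (Tc_inf, C)
        design = np.column_stack([np.ones_like(n), n ** (-x)])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        rss = float(np.sum((design @ coef - y) ** 2))
        if best is None or rss < best[0]:
            best = (rss, coef[0], coef[1], x)
    return best[1], best[2], best[3]
```

Calling the function with `n_min` set successively to 100, 200, 500 and 750 mimics the successive rows of Table 5 for $T_c$ and $1/\nu_{r,1}$; with fewer than three retained sizes the three-parameter fit is no longer constrained.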

Pseudo-critical temperatures thus appear to represent particularly interesting observables for the
model here, with their scaling behaviour clearly in accordance with the typical scenario corresponding
to relevance of disorder, the same as the one in the analysis of random ferromagnets [31,33]: the
numerical analysis provides evidence for a smooth phase transition (as already found in [1]), the
thermodynamic limit behaviour being ruled by a single correlation length, in agreement notably
with the mathematical findings in [H El El E]. More precisely, our present best estimation of the
random critical point correlation length exponent ($\nu_r = 2.9 \pm 0.4$) is perfectly consistent with the
previous evaluation in [3], obtained through the scaling of the maximum of the specific heat averaged
over disorder in the standard way.

Importantly, it is also clear that, for the pseudo-critical temperature oriented analyses,
asymptotic behaviours appear to be reached for chain lengths shorter than those considered in the present
study (up to $N = 2 \cdot 10^4$), despite the strong finite size corrections to scaling characterizing the
behaviour of the various thermodynamic observables in this disordered model IMHH. This situation
is expected to be related to the choice of parameters [3], and in order to reach a quantitative
description of the dependence of the behaviour of the model on these parameters it would be relevant to
determine a crossover chain length $N^*$, below which the present data could also agree with a pseudo
first order transition.

In this respect, in order to grasp the behaviour of the model in the thermodynamic limit,
looking in detail at Fig. 5, it appears important on qualitative grounds to have $N > 1000$ (i.e.
$1/N < 0.001$). Indeed, with $N < 1000$, the average $T_c(N)$, as a function of $1/N$, would appear
instead to be adequately fitted with a straight line, while this is not true for $\delta T_c(N)$. Thus the
behaviour of these observables would be in agreement with a transition with $\nu_{r,2} > \nu_{r,1} = \nu_p = 1$.
Accordingly, $N^* \simeq 1000$ can be retained as the evaluation of a crossover chain length suggested by
data on the mean value and on the fluctuations of the pseudo-$T_c$ in our case. Clearly, such a value
corresponds to an evaluation from below, as the considered observables appear to be the least affected
by corrections to scaling.

4.3 Comparisons between different ways of averaging 

The behaviour of $\widetilde{\theta}(\{\epsilon_i\},T,N)$ is plotted in Fig. 6, following the two definitions of pseudo-critical
temperatures that we considered. Indeed, the present analysis aims to understand at a qualitative
level the general results on quantities averaged by taking into account the pseudo-$T_c$. In this
respect the plots in Fig. 6 can be compared to those presented in [3], in which the standard
averaging method was used. In any case, we do not perform any finite size scaling analysis for the
order parameter as, independently of the averaging method, the behaviour of this observable is not
in agreement with the expected scaling law (hence with $N^{-\beta_r/\nu_r}\,\widehat{\theta}[(T_c - T)\,N^{1/\nu_r}]$, with $\widehat{\theta}$ a scaling
function, which can be derived from Eq. (16) with $\rho_r = \beta_r = \nu_r - 1$). This feature was also observed
both in [5] and in our previous work [3].

From a qualitative point of view, the data displayed in Fig. 6 show unambiguously that, with
varying $N$, the order parameters do not cross at the same temperature. This situation stands in
sharp contrast with the one characterizing the pure model and also with the results reported in
UM. As a matter of fact, this feature, clearly in disagreement with the possibility of a pseudo first
order character for the behaviour of the model, was already observed in [3]. With the analyses here
it further appears that this result does not depend on the averaging method used. In fact, this
result concerning crossovers becomes even more obvious for shorter chain lengths upon looking at
the behaviour of $\widetilde{\theta}$. Finally, this observation holds independently of the definition of the
pseudo-critical temperature, even though the positions of the crossing points appear in fact to vary still
more rapidly with $N$ when averaging is performed by taking the pseudo-$T_c$ as the abscissa of the
absolute maximum of the susceptibility. Thus, the analysis in terms of pseudo-critical temperatures
appears to reduce, in general, the importance of the finite size effects, hence allowing notably to
extrapolate the correct thermodynamic limit behaviour from shorter chain lengths (i.e. decreasing
the effective $N^*$ value).

For further clarification, we finally consider in detail the behaviour of the average susceptibility,
whose evaluation is expected to be the most sensitive to different ways of averaging. Indeed, notably
in the context of the study here, resorting extensively to the definition of the pseudo-critical
temperature as the position of the absolute maximum of susceptibility for a given sequence, the quantity
$\chi(\{\epsilon_i\},T,N)$ is expected to best highlight, with its possible divergence, a pseudo first order character in
the behaviour of the model (see Eq. (19)). Moreover, noticeably, the maximum of this quantity
is expected to behave as a typical quantity, in the sense that it is the observable least affected by
fluctuations in the pseudo-Tc itself (see Eq. (HI). From this point of view the study of this quantity
is also expected to best highlight possible differences between typical and average behaviours.

Figure 6: Plots of $\theta(\{\epsilon_i\},N,T)$: a) the pseudo-critical temperature is defined as the value for which
the absolute maximum of the susceptibility is reached; b) for chain lengths N < 2500, the
pseudo-critical temperature is defined as the crossing point of the plots of $\theta(\{\epsilon_i\}, tN, T)$ (for t = 1, 2 and 4,
obtained from the concatenation of t copies of the original sequence).

Figure 7: a) Plots of the average susceptibility $\chi(\{\epsilon_i\},T,N)$, obtained with the different ways of
averaging, for chain length N = 2500 as function of the temperature. b) Plots of the corresponding
maxima as function of the system size. In the first case
averaging is performed in the standard way (same data as in [3]). In the second case averaging is
performed taking into account the pseudo-critical temperature, following the two definitions
considered (i.e. with $T_c(\{\epsilon_i\}, N)$ the temperature where, for the given sequence, the susceptibility reaches
its absolute maximum, and with the alternative pseudo-critical temperature corresponding to the crossing point of
the order parameter, for the given sequence, with those of the sequences obtained by concatenating
a variable number of copies of the given sequence).


In detail, for the susceptibility data, the results for N = 2500 are plotted in Fig. 7a for the
susceptibility averaged in the standard way (as obtained from Eq. (11) and corresponding to the same data as in [3]) and for
the susceptibility averaged by taking into account the pseudo-critical temperature (as obtained from Eq. (17)), using the
two definitions of the pseudo-critical temperature (the absolute maximum of the susceptibility, $T_c(\{\epsilon_i\}, N)$, and the
crossing point of the order parameter, respectively). It is then obvious from the figure that the possible pseudo
first order character of the behaviour of the model is indeed suggested most clearly by the data on
$\chi$ evaluated by defining the pseudo-Tc as the abscissa of the absolute maximum of susceptibility
(for a given sequence), i.e. $T_c(\{\epsilon_i\}, N)$, with the average susceptibility reaching a definitely higher
maximum in this case. The observation is in agreement with the expected result, thereby confirming
that the considered $T_c(\{\epsilon_i\}, N)$ is particularly appropriate for assessing disorder relevance. Moreover,
in this context it becomes easy to determine the behaviour of the model with respect to the typical
scenario.

In fact, when looking at the scaling behaviour of the maxima as functions of N in Fig. 7b,
a good qualitative agreement is found between the different ways considered for performing the
averages. In particular, in all cases the maximum of the average susceptibility displays a crossover
to an N-independent regime for the largest chain lengths considered, clearly showing that there is
no difference in the behaviour of typical and average quantities. The data therefore further support
a smooth transition (with a susceptibility exponent equal to $\alpha_r < 0$, and accordingly $\nu_r = 2 - \alpha_r > 2$).
From a different point of view, these findings also clearly confirm qualitatively that the thermodynamic limit behaviour
of the model is described by a single correlation length, in agreement with the result that
the mean value and the fluctuations of the pseudo-Tc scale with the same exponent (namely,
$\nu_r = 2.9 \pm 0.4$). It is worth recalling that in our previous work [4], by fitting the data for the
maximum of the specific heat (averaged in the standard way) to the law $\propto N^{\alpha_r/\nu_r}$, we obtained
(with the first two points disregarded) $\alpha_r/\nu_r = -0.3 \pm 0.1$, and hence $\nu_r = 2.9 \pm 0.6$. Hence, it would
be meaningless to repeat the analysis on $\chi$ here.

Inasmuch as the data for $\max_T\{\chi(\{\epsilon_i\},T,N)\}$ display an abrupt change as function of
N, it is all the more appropriate to introduce a crossover chain length N* for characterizing the
slow approach of the model to the asymptotic regime. More precisely, we observe a shift from an
increasing behaviour for short chains to a nearly constant one for long chains, around a value of N* ~ 2500,
which is compatible both with the estimate from below (N* ~ 1000) obtained qualitatively from the
scaling of the mean value and of the fluctuations of the pseudo-critical temperature and with the
(more accurate) estimate from above (N* ~ 2500 ÷ 5000) obtained according to the behaviour of
the loop-length probability distribution (see below).

4.4 Non self-averageness parameter related to susceptibility

As already recalled, both in the typical case of disorder relevance and within the pseudo first order
transition scenario, one expects strong non self-averageness in the thermodynamic observables which
are singular at the critical point. Accordingly, the parameter defined by Eq. m, which measures the
relative fluctuations of the observable averaged over disorder in the standard way, should display a
constant behaviour instead of decreasing as a function of the system size N at $T_c$. In fact, numerical
evidence was reported in [6] for such a behaviour of the non self-averageness parameter related to
$\theta$, for the disordered PS models considered with different $c_p$ values, and notably for the one with
$c_p = 2.15$. It is therefore reasonable to assume that the same result should hold in the present system.
Indeed, intuitively enough, since the single sequence order parameter displays multi-step behaviour
in a significant fraction of the samples, the sequence-to-sequence fluctuations of this observable,
averaged in the standard way, should play a role at least as important as in the model
studied with $c_p = 2.15$ in [6], in which this behaviour is not observed.
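
As a purely illustrative aside, the sketch below computes a relative-variance parameter of the kind discussed above from a set of per-sequence values of an observable at fixed N and T. It assumes the commonly used form (mean of X squared minus squared mean of X, divided by the squared mean), which may differ in detail from the definition adopted in this work; the function name and the sample values are hypothetical.

```python
def non_self_averaging_parameter(samples):
    """Relative variance over disorder realizations (assumed form):
    R = (<X^2> - <X>^2) / <X>^2, with <.> the average over sequences."""
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    mean = sum(samples) / n
    mean_sq = sum(x * x for x in samples) / n
    if mean == 0:
        raise ValueError("mean is zero; relative fluctuations undefined")
    return (mean_sq - mean * mean) / (mean * mean)


# Hypothetical per-sequence susceptibilities at fixed N and T.
chi_samples = [1.0, 3.0]
print(non_self_averaging_parameter(chi_samples))
```

For a self-averaging observable this ratio would be expected to decrease with N, whereas a value that stays roughly constant with N at the critical point is the signature of non self-averageness referred to in the text.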

In this background, we instead focus here on the non self-averageness parameter related to the
susceptibility, $\mathcal{R}_\chi(N,T)$, whose behaviour as function of T is plotted in Fig. 8 for the different
sizes considered. First, leaving aside the two shortest lengths N = 100 and N = 200, in the plots of
$\mathcal{R}_\chi(N,T)$ no evident dependence on the chain length N is observed for the heights of the peaks, thus
confirming the expected strong non self-averageness of this observable in the present model. The
plots in Fig. 8 display rather irregular behaviours in the whole high-T region, even though it is in
general expected that $\mathcal{R} \sim 1/N$ both above and below the critical point. This observation can be
explained by the fact that here the disorder couples only to the low-temperature phase, with the
order parameter being zero in the thermodynamic limit for $T > T_c$, where the two DNA strands
are only linked at the origin. It is then particularly difficult to evaluate correctly in this region the
fluctuations due to disorder of the average susceptibility.

Figure 8: Plots of the non self-averaging parameter related to susceptibility, $\mathcal{R}_\chi(N,T)$, as function
of the temperature, for the different chain lengths considered. No accurate evaluations of the errors
are performed, even though they are expected to be at most of the order of the observed oscillations
of the quantity.

Nevertheless, it is interesting to notice in Fig. 8 that the plots for different N-values display
similar behaviours near the thermodynamic limit critical temperature $T_c \simeq 1.1$. In particular, a
steep increase is observed in the plots immediately above $T_c \simeq 1.1$, with the steepness increasing
with the size, and at temperatures approaching $T_c$ from above for increasing sizes. In more
quantitative terms, for the different N-values considered, the position of the highest $\mathcal{R}_\chi(N,T)$ peak,
corresponding to the best evaluation of the abscissa of its absolute maximum allowed by our
present statistics, appears to be interpretable as the mean value of a quantity behaving as a pseudo-critical
temperature (or in any event as an N-dependent evaluation of the thermodynamic limit
critical temperature $T_c$), which we can denote $T^{\mathcal{R}}(N)$. In fact, we checked that the scaling law
$T^{\mathcal{R}}(N) = T_c + a\,N^{-1/\nu_r}$ is also valid for this observable. With respect to the previous cases,
corresponding to the two different definitions of $T_c(\{\epsilon_i\}, N)$, it is clear that the constant $a$ has the opposite
sign, with the fit leading anyway to compatible estimates of $T_c$ and $\nu_r$.

5 Characterization of the finite size behaviour: results and insights 

5.1 Crossover to the asymptotic regime 

The quantity:

$$P'(\{\epsilon_i\}, N, T, l) = l^{c_p}\, P(\{\epsilon_i\}, N, T, l), \qquad (21)$$

was introduced in our previous work [1] as more appropriate for capturing the relevance of disorder than
the loop-length probability distribution $P(\{\epsilon_i\}, N, T, l)$. Indeed, at the critical point, the logarithm
of $P'(\{\epsilon_i\}, N, T, l)$ is expected to be non-constant and proportional to $(c_p - c_r)\log l + C$ as soon as
$c_r \neq c_p$.

In more detail, we can consider both $\log \overline{P'(\{\epsilon_i\}, N, T, l)}$ and $\overline{\log P'(\{\epsilon_i\}, N, T, l)}$, which should
allow one to capture the behaviours of the average and typical correlation lengths respectively,
following the picture put forward in [5, 6] (with in fact the second case better described as a mixed
average). The behaviour of these quantities for $T \simeq T_c$, as shown in [3], was instead found
compatible with a smooth transition, but it is noticeable that the evaluated critical exponent $c_r$ appears
to depend on N. In fact, such dependence is more evident when considering the average
quantity $\log \overline{P'(\{\epsilon_i\}, N, T_c, l)}$, further displaying more important deviations from the expected behaviour
($\propto \log l$) than $\overline{\log P'(\{\epsilon_i\}, N, T_c, l)}$.

In summary, the analysis of the data in [3] suggested the asymptotic condition $c_r < 1.5$, which is
confirmed by our present evaluation $c_r = 1.35 \pm 0.05$ obtained from the pseudo-Tc behaviour.
Nevertheless, to further clarify the situation, attempting to better characterize the finite size behaviour of
the present model, we are led to study in detail both $\log \overline{P'(\{\epsilon_i\}, N, T, l)}$ and $\overline{\log P'(\{\epsilon_i\}, N, T, l)}$ on
the whole relevant T-range, by introducing effective (N-dependent) critical exponents $c_{r,1}(N)$ and
$c_{r,2}(N)$, as well as correlation lengths $\xi_1(N,T)$ and $\xi_2(N,T)$.

We found that the data concerning $\log \overline{P'(\{\epsilon_i\}, N, T, l)}$ are accurately described by the
expected scaling law, which can be easily obtained from Eq. (9). Accordingly, the behaviour of
$\log \overline{P'(\{\epsilon_i\}, N, T, l)}$, which should be ruled by the fluctuations of the pseudo-Tc, is in agreement
with:

$$\log \overline{P'(\{\epsilon_i\}, N, T, l)} \simeq (c_p - c_{r,2}(N)) \log l - l/\xi_2(N,T) + C(N). \qquad (22)$$

On the other hand, in order to obtain a satisfactory fit (within the errors) for the data concerning
$\overline{\log P'(\{\epsilon_i\}, N, T, l)}$ (with such analysis expected to better capture the typical loop-length probability
distribution behaviour), it appears necessary to introduce, in addition, a quadratic contribution in
l, i.e.:

$$\overline{\log P'(\{\epsilon_i\}, N, T, l)} \simeq (c_p - c_{r,1}(N)) \log l - l/\xi_1(N,T) - C_1(N)\, l^2 + C_2(N), \qquad (23)$$

with the constant $C_1(N)$ tending towards zero roughly as 1/N.
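
Since both fitting forms are linear in their parameters (the coefficient of log l, the inverse correlation length, the quadratic coefficient and the additive constant), the fits can be set up as ordinary linear least squares. The sketch below is only one possible way of doing this, not necessarily the procedure used to produce the results discussed here; it ignores error bars on the data, and the function names and the synthetic input values are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_loop_distribution(ls, log_p, quadratic=False):
    """Unweighted least-squares fit of log P'(l) to
        a*log(l) - l*inv_xi [- c1*l**2] + const,
    i.e. the three-parameter form, or the four-parameter form when
    quadratic=True. Returns (a, inv_xi, c1, const); c1 is 0.0 if unused."""
    rows = []
    for l in ls:
        row = [math.log(l), -float(l)]
        if quadratic:
            row.append(-float(l) ** 2)
        row.append(1.0)
        rows.append(row)
    k = len(rows[0])
    # Normal equations (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, log_p)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    if quadratic:
        a, inv_xi, c1, const = beta
    else:
        a, inv_xi, const = beta
        c1 = 0.0
    return a, inv_xi, c1, const


# Synthetic, noise-free data (hypothetical values, for illustration only).
ls = list(range(3, 60))
data = [0.8 * math.log(l) - 0.01 * l - 1e-4 * l * l + 0.5 for l in ls]
a, inv_xi, c1, const = fit_loop_distribution(ls, data, quadratic=True)
```

In this parametrization `a` plays the role of the difference between the pure and effective loop exponents and `inv_xi` that of the inverse effective correlation length. The normal equations are adequate for a short illustration; for real data over a wide l-range a better conditioned solver and weights from the error bars would be preferable.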

Such an approach appears adequate to describe the strong finite size corrections to scaling
characterizing the present model, and above all it allows a clear definition of N*. In fact, the crossover
between a short chain length regime, in which the effective exponents would be in agreement with a
pseudo first order transition, and a long chain length regime, in which they would be compatible with the
asymptotic values, turns out to be quite abrupt. Therefore, one can obtain a quantitative evaluation
of the crossover chain length (which is an evaluation from above) as the length N* beyond which
$c_{r,1}(N) \simeq c_{r,2}(N) \simeq c_r \simeq 1.35$ and, for $T \to T_c^-$, $\xi_1(N,T) \simeq \xi_2(N,T) \sim (T_c - T)^{-\nu_r}$, with
$1/\nu_r = c_r - 1 \simeq 0.35$.

Figure 9: a) Evaluation, for the various chain lengths considered, of $c_p - c_{r,2}(N)$ from the fits (three
parameters) of $\log \overline{P'(\{\epsilon_i\}, N, T, l)}$ to Eq. (22), as function of temperature. b) Evaluation, for the
various chain lengths considered, of $c_p - c_{r,1}(N)$ from the fits (four parameters) of $\overline{\log P'(\{\epsilon_i\}, N, T, l)}$
to Eq. (23), as function of temperature. For both analyses the errors are only of indicative value,
as the results display some dependence on the l-range (with the range $l \in [3, N/3]$ considered
corresponding to a reasonable choice), in particular for the shortest chain lengths. For comparison,
the expected asymptotic behaviour ($c_p - c_r \simeq 0.8$) is also represented in both panels.

The detailed results for the loop-length probability distribution exponents and the inverse of the
correlation lengths are presented in Fig. 9 and Fig. 10, respectively, associated with Eq. (22) and
Eq. (23) in each of the two cases. More precisely, the figures plot the evaluations of $c_p - c_{r,2}(N)$,
$c_p - c_{r,1}(N)$, $1/\xi_1(N,T)$ and $1/\xi_2(N,T)$, as obtained from the fits above, as functions of temperature
for the different sizes considered. These plots also allow one to compare the results with $c_p - c_r \simeq 0.8$
and $1/\xi(T) \sim (T_c - T)^{2.9}$ for $T \to T_c^-$ (the asymptotic inverse of the correlation length being zero
in the whole high temperature phase).

Figure 10: a) Evaluation of the inverse of the correlation length $\xi_2(N,T)$ as function of temperature for the
various chain lengths considered, from the fits (three parameters) of $\log \overline{P'(\{\epsilon_i\}, N, T, l)}$ to Eq. (22).
b) Evaluation of the inverse of the correlation length $\xi_1(N,T)$ as function of temperature for the
various chain lengths considered, from the fits (four parameters) of $\overline{\log P'(\{\epsilon_i\}, N, T, l)}$ to Eq. (23).
The same remarks as in Fig. 9 apply to the significance of errors. For comparison, the expected
asymptotic behaviour ($1/\xi(T) \propto (T_c - T)^{2.9}$) is also represented in both panels.

The plots in Fig. 9 and Fig. 10 clearly illustrate the strong N-dependence of the quantities
considered. Nonetheless, as shown in particular in Fig. 9a, for chain lengths N > 5000, by allowing
for a non-zero effective $1/\xi_2(N,T)$, the evaluations of $c_p - c_{r,2}(N)$ from $\log \overline{P'(\{\epsilon_i\}, N, T, l)}$ are consistent
with $c_r \simeq 1.35$ over a large temperature range around $T_c$. Accordingly, for the disordered PS
model with $c_p = 2.15$ studied, it appears that N* = 2500 ÷ 5000 can be taken as our best
evaluation from above of the crossover length N*. This conclusion also holds for the analysis of
the $\overline{\log P'(\{\epsilon_i\}, N, T, l)}$ data, even though more significant corrections are involved in this case.

It is also very illustrative to notice that limiting the analysis to chain lengths N < 2500 can
lead to interpretations significantly different from those obtained considering the long chain length
regime: one would get from Fig. 9b an estimate for $c_p - c_{r,1}(N)$ essentially compatible with the
zero value characteristic of the pure case (i.e. $c_{r,1} = c_p = 2.15$, which would imply $\nu_{r,1} = \nu_p = 1$), as
well as (from Fig. 9a) an estimate for $c_{r,2}(N)$ not significantly larger than 1.5 (i.e. $c_{r,2} = 1 + 1/\nu_{r,2}$
with $\nu_{r,2} = 2$). In this respect, the analysis here confirms that in the regime below the crossover
chain length N* it is difficult to extrapolate the correct thermodynamic limit behaviour on general
grounds, with the results being indeed interpretable in such context according to the picture implying
a pseudo first order phase transition as proposed in [5, 6].

The importance of finite size effects is further highlighted in Fig. 10. This figure allows one to
compare the behaviour of $1/\xi_2(N,T)$ (Fig. 10a, as obtained from Eq. (22)) with that of $1/\xi_1(N,T)$
(Fig. 10b, as obtained from Eq. (23)), with the plots clearly showing that the two quantities
approach the asymptotic limit $\propto (T_c - T)^{\nu_r}$ with $\nu_r \simeq 2.9$ from opposite sides. In particular, for
short chain lengths N < N*, it appears that the value of $1/\xi_2(N,T)$ is definitely different from zero
on the whole T-range considered. It can also be observed that, in the same short chain length regime,
for the average performed after taking the logarithm, negative (physically meaningless) values are
obtained for $\xi_1(N,T)$ on a large part of the T-range (bearing in mind that in this case it appears
necessary to also insert a quadratic term in l in order to fit the data). Even though the data
here do not allow a more quantitative analysis, these results further imply that below the crossover
(i.e. for chain lengths N < 2500) evidence is observed for the presence of two correlation lengths
(and accordingly for two different critical exponents $\nu_{r,2} \neq \nu_{r,1}$).

Figure 11: Plots of $1/\xi_2(N,T)$ and $1/\xi_1(N,T)$ on log-log scale as functions of $(T_c - T)$ (with $T_c \simeq 1.1$)
for the three longest chain lengths considered (N = 10000, N = 15000 and N = 20000; same data as
in Fig. 10). Data are plotted against the expected asymptotic behaviour of the inverse correlation
length $1/\xi(T) \propto (T_c - T)^{\nu_r}$, with $T_c = 1.1$ and $\nu_r = 2.9$. For comparison, the behaviour
$1/\xi(T) \propto (T_c - T)$ is also displayed.

Figure 12: a) Plots of $\log \overline{P'(\{\epsilon_i\}, N, T_c, l)}$ as functions of l at $T = T_c \simeq 1.1$, for the different chain
lengths considered. b) Plots of $\overline{\log P'(\{\epsilon_i\}, N, T_c, l)}$ as functions of l at $T = T_c \simeq 1.1$, for the different
chain lengths considered. For both a) and b) the averages are performed by taking into account the
pseudo-critical temperature, following Eq. (17). All curves are shifted arbitrarily, setting them to
zero for l = 2. The expected asymptotic limit behaviour $\propto (c_p - c_r) \log l + C$, with $c_p - c_r = 0.8$ (i.e.
$c_r = 1.35$), is also plotted, in both panels, as a dotted line.

On the other hand, the behaviours of $\xi_1$ and $\xi_2$ for N > N* are in agreement over a large
T-range, within the error margins, with a correlation length exponent $\nu_r \simeq 2.9$, as clearly shown in
Fig. 11, which displays the plots of $1/\xi_1(N,T)$ and $1/\xi_2(N,T)$ for the three largest sizes considered.

The fact that the two quantities converge towards the same behaviour is particularly meaningful:
these observables are very demanding to measure, and moreover they are the ones expected to
be the most affected by corrections to scaling. Therefore, the result confirms that for $N \gtrsim 10^4$ the
asymptotic regime of the model considered is definitely reached.

Finally, for the different chain lengths, Fig. 12 displays plots of $\log \overline{P'(\{\epsilon_i\}, N, T_c, l)}$ and
$\overline{\log P'(\{\epsilon_i\}, N, T_c, l)}$ with the averages taking into account the pseudo-critical temperature, following
Eq. (17). More precisely, in this case, for any given sequence, the loop-length probability
distribution which contributes to the average is the one evaluated at the $T_c(\{\epsilon_i\}, N)$ where the
susceptibility of the sequence reaches its absolute maximum. Upon comparing these plots to those of
$\log \overline{P'(\{\epsilon_i\}, N, T_c, l)}$ and $\overline{\log P'(\{\epsilon_i\}, N, T_c, l)}$ in [1] (respectively Fig. 6 and Fig. 9 in this
reference), it becomes once more clear that the analysis in terms of pseudo-critical temperatures allows
one to reduce the importance of finite size corrections to scaling. This can be seen first of all in the fact
that here the data corresponding to the average after taking the logarithm display a behaviour more similar
to the one in which the logarithm is taken after the average. Moreover, both quantities approach
qualitatively more rapidly the asymptotic behaviour $\propto (c_p - c_r) \log l + C$, with $c_r \simeq 1.35$, in the
present case.


5.2 Dependence of the crossover chain length on the model parameters 

On general grounds, the rounding of the transition due to disorder in the present model should be 
mainly attributable to the presence of rare regions in the sequences (as expected in particular from 
the theoretical results m), or otherwise stated to the importance of the atypical events Illlli. 
Accordingly, we attempt to quantify roughly their contribution to the finite size behaviour of the
system, to better understand the way in which relevance of disorder becomes manifest, starting from
the finite size level. This approach relies on the numerical evaluation here of N* for the particular
model considered, as well as on the previously proposed very simplified phenomenological picture

i- 

Qualitatively speaking, the approach is expected to be meaningful in the present case with
$c_p > 2$, corresponding to a first order transition in the pure model. Indeed, from the finite size
behaviour point of view, this model appears to be characterized in general by a slow approach
to the asymptotic regime, and the evidence for the effect of disorder (leading to the predicted
smooth transition described by a single correlation length) appears to be related to the quite sudden
appearance, at N ~ N*, of enhanced (with respect to the $c_p < 2$ case) multi-step behaviours in the
order parameter (and accordingly of different peaks in the susceptibility) in a significant fraction of
the sequences considered. This feature can then be expected to be related to the presence in such
sequences of large enough rare regions, which would then already occur, at N ~ N*, with
non-negligible probability.

It is worth recalling that in [3] we showed notably that the relevant quantity from this point of
view is expected to be the dimensionless ratio between the rare region length L (i.e. the number of
consecutive base pairs of the same kind) and the value of the parameter:

$$x = \frac{R}{R-1}\,\frac{1}{\log\mu}, \qquad (24)$$

in the particular model à la PS with $c_p = 2.15$ considered. This quantity is therefore to be interpreted
as a measure of the effective disorder strength, linking the concept of large enough rare region to
the values of $R = \epsilon_{GC}/\epsilon_{AT}$ and $\log\mu$ in the model.

Accordingly, we are led to the combinatorial problem of evaluating the probability $\mathcal{P}(N,L)$ to
observe, in a sequence of total length N, a subsequence of consecutive base pairs of the same kind
of length at least L. Such an evaluation could appear to be rather simple. However, even obtaining
the exact solution for the binomial distribution of the quenched disorder variables given by Eq. Q
appears to be far from trivial, and we resort here to the approximation:

$$\mathcal{P}(N,L) \simeq (N - L + 1)\, 2^{-L}. \qquad (25)$$

This approximation is exact for L = N, but clearly invalid for small L values, where it leads to probabilities
higher than 1. Nonetheless, it can be considered that this approximation provides a reasonable basis
for the present analysis, as we are mainly interested in capturing the order of magnitude of the quantity
(we checked numerically, by exact computations, that this is roughly the case, in a large L range,
already for small N values).
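
As an illustration of how such a comparison with exact values can be carried out, the sketch below evaluates the run probability exactly with a small dynamic program and sets it against the approximation of Eq. (25). It is not claimed to be the procedure used for the check mentioned above. It assumes independent base pairs, each of one given kind with probability 1/2, and it reads "of the same kind" as a run of that given kind (the reading under which Eq. (25) is exact for L = N); the function names are hypothetical.

```python
def prob_run_at_least(n, run, p=0.5):
    """Exact probability that n independent sites contain at least one run
    of `run` or more consecutive sites of one given kind (probability p)."""
    if run <= 0:
        return 1.0
    if run > n:
        return 0.0
    # state[k] = probability of no qualifying run so far, with the sequence
    # currently ending in exactly k consecutive sites of the given kind.
    state = [0.0] * run
    state[0] = 1.0
    for _ in range(n):
        new = [0.0] * run
        new[0] = (1.0 - p) * sum(state)   # a site of the other kind resets the run
        for k in range(run - 1):
            new[k + 1] = p * state[k]     # the run grows but stays below `run`
        state = new
    return 1.0 - sum(state)


def approx_run_probability(n, run):
    """Approximation of Eq. (25): (N - L + 1) * 2**(-L)."""
    return (n - run + 1) * 2.0 ** (-run)


# Side-by-side comparison for a short chain (illustrative values only).
for length in range(5, 16):
    print(length, prob_run_at_least(100, length), approx_run_probability(100, length))
```

Under these assumptions the approximation counts every possible starting position of the run independently, so it can only overestimate the exact value; the overestimate is largest at small L, in line with the remark that the approximation exceeds 1 there.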

As already outlined above, at the crossover one expects to find large enough rare regions with
a non-negligible probability. Here we set this probability to the value 0.5, based on the observation
that the order parameter displays multi-step behaviour at N ~ N* (i.e. multi-peak behaviour of
the susceptibility) in a fraction of order O(1) of the sequences considered. Accordingly, by applying
Eq. (25) with $\mathcal{P}(N^*, L^*) = 0.5$ (hence finding numerically the solution $L^*(N^*)$ of $(N^* - L^* + 1)\,2^{-L^*} = 0.5$),
we get an estimate of the crossover rare region length $L^* \sim 14$, in correspondence with the
crossover chain length $N^* \sim 10^3 \div 10^4$ which characterizes the present model. Assuming that the
phenomenological picture does indeed capture the basic physics of the problem, and recalling that
one has $x = x_{CY} \simeq 1.3$ in the present case, it is now in general easy to predict the behaviours of
$L^*(x)$ and of the corresponding crossover chain length $N^*(x)$ as functions of the parameter x in
models à la PS for the DNA denaturation transition with $c_p = 2.15$. In fact, in order for the effect of
disorder to be equally manifest in different models, it is expected to be necessary for the underlying
ratio values $L^*(x)/x$ (of the crossover rare region length to the parameter x) to be similar [1]. Such
a condition can be expressed as $L^*(x) = L^*(x_{CY})\,x/x_{CY}$. By using $x_{CY} = 1.3$ with $L^*(x_{CY}) = 14$, it
is then possible to obtain $N^*(x)$ by solving once again Eq. (25) with $\mathcal{P}(N^*, L^*) = 0.5$.
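
The two steps just described can be sketched as follows. This is a minimal illustration under the stated approximation, not the authors' code: Eq. (25) is linear in N, so N* follows in closed form from L*, while L* for a given N* is found by bisection. The function names, the bisection tolerance and the default arguments (the reference values 1.3 and 14 quoted above) are choices made here for illustration.

```python
import math


def crossover_chain_length(l_star, prob=0.5):
    """N* solving (N* - L* + 1) * 2**(-L*) = prob, in closed form."""
    return prob * 2.0 ** l_star + l_star - 1.0


def crossover_rare_region_length(n_star, prob=0.5, tol=1e-9):
    """L* solving (N* - L* + 1) * 2**(-L*) = prob for given N* >= 2,
    by bisection on [1, N*], where the left-hand side decreases with L*."""
    lo, hi = 1.0, float(n_star)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (n_star - mid + 1.0) * 2.0 ** (-mid) > prob:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rescaled_rare_region_length(x, x_ref=1.3, l_ref=14.0):
    """L*(x) = L*(x_ref) * x / x_ref, assuming a constant ratio L*/x."""
    return l_ref * x / x_ref


# Order of magnitude of N*(x) for a hypothetical parameter value x.
x = 2.0
n_star = crossover_chain_length(rescaled_rare_region_length(x))
print(x, math.log10(n_star))
```

Because N* grows essentially as 2 to the power L*, and L* is proportional to x, the predicted crossover length increases exponentially with x, which is why moderate changes in the model parameters can shift N* by many orders of magnitude.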

In this way, for the PS model with $c_p = 2.15$ in [5, 6], we find extremely large N* values of order
10^*^-^10^*^. Indeed, the link energies and the connectivity constant used in [5, 6] (which we recalled in
Tab. 3) lead to $x = x_{GM} \simeq 16$. Taking also into account the somewhat different law for the allowed
coupling adopted in that study, the effective value of the parameter $x_{GM}$ is in fact expected to be
still larger. Even though relying on rough evaluations, it appears therefore reasonable to consider
that with the conditions in [5, 6] it would be impossible, in practice, to reach the asymptotic regime.
In such context, the PS model with $c_p = 2.15$ in [5, 6] is indeed expected to behave in
agreement with a pseudo first order transition, ruled by two correlation lengths (corresponding to
typical and average quantities; namely with $\nu_{r,1} = \nu_p = 1$ and $\nu_{r,2} = 2$), independently of the
observable considered and even well beyond the (already extremely large) chain lengths studied (up
to N = 2 · 10®).

Thus, in the context of the analysis above, it appears possible to reconcile the two different 
pictures emerging from the previous numerical studies IMl 13 0 E]- Further, in such context, the 
scenario associated with the pseudo first order transition appears meaningful to describe the
finite-N behaviour in the presence of weak disorder. In this case, it would moreover be interesting to
clarify the importance of boundary conditions, i.e. whether the second correlation length is better
interpreted as the one related to the free-end distance or to an average loop length, with in fact the
results here concerning the loop-length probability distribution favouring the second alternative.

Finally, it is worth noting that the $x_{CY} \simeq 1.3$ of the present model, which pertains to the region
associated with large finite size effects (yet possible to study), is definitely closer to the still smaller
$x_{exp} \simeq 0.2$ associated with the values of the coupling energies and of $\log\mu$ typically adopted for
comparisons with experimental melting curves. In detail, the value of R adopted for comparison
with experiments is essentially the same ($R \simeq 1.1$) as the one in [5, 6], with, however, underlying
link energies more than an order of magnitude smaller and a $\log\mu$ value more than an order of
magnitude larger. The simple approximation of $\mathcal{P}(N, L)$ considered may not be sufficient for precise
predictions of the $L^*_{exp}$ and $N^*_{exp}$ values. However, it seems reasonable to expect a crossover rare region
length of the order of a few base pairs, and a corresponding relatively small crossover chain length.
From this point of view, the present results on the studied PS model with $c_p = 2.15$ suggest that
both self-avoidance and disorder could play a key role in experimental DNA behaviour.

Nevertheless, apart from the possible effect of neglecting correlations in the sequences, which are
well known to be present in DNA molecules, a thorough understanding of this phenomenon would
also require better clarifying the role of the cooperativity factor $\sigma_0$ in PS models. Indeed, in order to
reproduce correctly the experimental curves, usually very small values, do 10 are adopted 

for this parameter (see in particular [49]). It would also be meaningful to further characterize the
influence of this parameter, of non-universal character, on the finite size behaviour of the system
(corresponding roughly to the introduction of an additional correlation length $\sim 1/\sigma_0$).

6 Conclusions 

We performed an extensive numerical analysis of a disordered PS model with $c_p = 2.15$, hence with
the loop-length probability distribution exponent of the pure system taking self-avoidance completely into
account. The analyses follow two main directions: i) a finite size scaling study in terms
of appropriately defined pseudo-critical temperatures and ii) an attempt to characterize the peculiar
finite size behaviour of the model.

Completing previous numerical results isnu, and in agreement with the predictions based on 
a probabilistic mathematical approach [HEIEIE!, the present work notably provides evidence that 
the thermodynamic limit behaviour of the model is coherent with the picture describing random 
ferromagnets [331 ED: a smooth transition described by a single correlation length (with the refined 
estimation here for the corresponding critical exponent $\nu_r = 2.9 \pm 0.4$).

The pseudo-critical temperature itself, which we define so as to take appropriately into account the
possible presence of multiple steps in the order parameter for a given sequence (checking its
compatibility with a different definition suggested in [5, 6] for the same model), appears to be a particularly
interesting observable to study. In detail, we find that the mean value and the fluctuations of this
pseudo-critical temperature agree well with the expected scaling laws in the typical scenario
corresponding to disorder relevance. Accordingly, the two quantities appear to be ruled by the same
critical exponent (the one corresponding to the correlation length), on the whole N-range considered,
without displaying the strong corrections to scaling observed in other quantities.

On this basis, it is also interesting to notice that the refined estimate of $\nu_r$ obtained with this
analysis is further compatible with the estimate $\nu_r \simeq 2.7$ given in [6] for a PS model with the
different $c_p = 1.75$. Therefore, the results here could also support the hypothesis that the random
PS model critical behaviour is independent of the $c_p$ value, as soon as the pure PS model undergoes
a transition characterized by a diverging specific heat.

From this point of view, it can nevertheless be noticed that we focused here on the simplest
possible scaling picture, and therefore that the present study does not rule out the possibility that the
complete description of the random system could involve more than a single independent critical
exponent. In such a case, it could be possible to get a scaling law obeyed by the order parameter
and a better scaling of the loop-length probability distribution at $T \simeq T_c$. However such a possibility,
whose exploration would represent a new subject, would clearly not impact the present result on $\nu_r$,
and it is not expected to impact the present results on the characterization of the finite size
behaviour either.

Moreover, in any event, the observation that in the present model the order parameter behaves
as the energy, and the susceptibility as the specific heat, could be relevant only for the disorder
regime considered. More specifically, we did not investigate here the strong disorder regime of the
model. Indeed, for $\epsilon_{GC} \gg \epsilon_{AT}$ the observed behaviour could be more complex, since models à la
PS are expected to display an additional singularity of the Kosterlitz-Thouless kind [50, 51, 52].

The present analysis also shows qualitatively that the finite size corrections are in general smaller 
when quantities are averaged taking into account the pseudo-critical temperatures. Finally, based 
on the study of an appropriately defined non self-averageness parameter, evidence is provided that 
the susceptibility is a strongly non self-averaging quantity at the critical point, with its detailed 
behaviour further allowing an alternative evaluation of both Tc and ν_r, compatible with our 
previous ones. 
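
To make the notion of non self-averageness concrete, the sketch below computes the relative variance 
over disorder samples, R_X = (<X^2> - <X>^2) / <X>^2, which is a commonly used parameter of this kind; 
we assume this standard form here for illustration only. The toy observables and the function names 
(relative_variance, toy_observable) are hypothetical and unrelated to the PS model: one is an average of 
independent site terms (self-averaging, R_X decreasing as 1/N), the other carries a sample-dependent 
global factor (non self-averaging, R_X tending to a constant). 

```python
import numpy as np

def relative_variance(samples):
    """R_X = (<X^2> - <X>^2) / <X>^2, with averages taken over disorder samples."""
    samples = np.asarray(samples, dtype=float)
    return samples.var() / samples.mean() ** 2

def toy_observable(n_sites, n_samples, self_averaging, rng):
    """Toy observable (assumption): mean of n_sites independent terms per sample,
    optionally multiplied by a random factor that is global to each sample."""
    x = rng.uniform(0.5, 1.5, size=(n_samples, n_sites)).mean(axis=1)
    if not self_averaging:
        x = x * rng.uniform(0.5, 1.5, size=n_samples)
    return x

rng = np.random.default_rng(0)
r_self, r_non = {}, {}
for n_sites in (10, 100, 1000):
    r_self[n_sites] = relative_variance(toy_observable(n_sites, 2000, True, rng))
    r_non[n_sites] = relative_variance(toy_observable(n_sites, 2000, False, rng))
    print(n_sites, r_self[n_sites], r_non[n_sites])
```

In this toy setting the first column of R_X values decreases with the system size while the second 
does not, which is the qualitative signature that distinguishes the two cases. 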

Against this background, mainly in the second part of the work, we seek a better understanding of 
finite size effects. The data for the maximum of the susceptibility, and even more so the detailed 
study of the behaviour of the loop-length probability distribution, give weight to the existence of 
a crossover chain length N* below which one could not rule out the picture proposed in [51 E]. 
This alternative scenario, that of a pseudo first order phase transition, is indeed found to capture 
the behaviour of a different PS model with the same c_p = 2.15 up to the extremely large sizes of 
N = 2 · 10^6. In detail, we present a qualitative evaluation from below, N* > 1000, from data on 
the pseudo-Tc, and a more quantitative evaluation from above, N* < 2500-5000, from data on 
the loop-length probability distribution. 

We also provide a tentative quantification of the dependence of N* on the parameters of the 
model, with notably an evaluation of the crossover length behaviour which could make it possible to 
reconcile the results here with those in U E]: with the parameters chosen for the model in U M it 
might not be possible to reach the asymptotic regime in practice. In this context, we also find that 
the crossover chain length obtained for realistic parameter values (as used in experimental settings) 
should be definitely smaller than the present N* and, accordingly, our conclusions, concerning both 
the importance of self-avoidance and the relevance of disorder, should also be important for better 
modeling of experimental DNA denaturation. 

It is finally also interesting to stress that these extensive studies of the finite size behaviour, 
particularly in the case of PS models for DNA denaturation transitions, were made possible thanks 
to the recursive equations for the partition functions (notably within the SIMEX scheme), which 
essentially reduce the time complexity of the problem to N ln(N). 
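
To give a rough sense of what this reduction means in practice, the short sketch below compares an 
N ln(N) operation count with a quadratic one. The quadratic reference is an assumption made here for 
illustration (it is the cost one would expect from a direct evaluation of recursions with long-range 
loop terms), prefactors are ignored, and the function name and chain lengths are hypothetical. 

```python
import math

def speedup_over_quadratic(n):
    """Ratio of an N^2 operation count to an N ln(N) one, ignoring prefactors."""
    return (n * n) / (n * math.log(n))

for n in (1000, 5000, 100000):  # hypothetical chain lengths
    print(n, round(speedup_over_quadratic(n)))
```

The ratio grows as N / ln(N), so the gain becomes more pronounced precisely at the large chain lengths 
needed to probe the asymptotic regime. 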

In conclusion, it is possible that crossover effects, such as those described here, could be relevant 
in a larger class of disordered systems (at least in the behaviour of certain observables), allowing a 
better understanding of the finite size behaviour, and in particular of the way in which relevance of 
disorder becomes manifest when approaching the thermodynamic limit. In this sense the results and 
treatments here could provide interesting insights for the exploration of such disordered systems, 
on more general grounds. On the biological side, the results here should also contribute to a better 
understanding of the roles played by disorder and self-avoidance in experimental DNA denaturation, 
notably through a quantitative description of the finite size effects, which are particularly 
important in this context. 